• TensorFlow是一个面向数值计算的通用平台,可以方便地训练线性模型。下面采用TensorFlow完成Andrew Ng主讲的Deep Learning课程练习题,提供了整套源码。

线性回归

多元线性回归

逻辑回归

线性回归


# -*- coding: utf-8 -*-
"""
Created on Wed Sep 6 19:46:04 2017 @author: Administrator
""" #!/usr/bin/env python
# -*- coding=utf-8 -*-
# @author: ranjiewen
# @date: 2017-9-6
# @description: compare scikit-learn and tensorflow, using linear regression data from deep learning course by Andrew Ng.
# @ref: http://openclassroom.stanford.edu/MainFolder/DocumentPage.php?course=DeepLearning&doc=exercises/ex2/ex2.html import tensorflow as tf
import numpy as np
from sklearn import linear_model # Read x and y
#x_data = np.loadtxt("ex2x.dat")
#y_data = np.loadtxt("ex2y.dat") x_data = np.random.rand(100).astype(np.float32)
y_data = x_data * 0.1 + 0.3+np.random.rand(100) # We use scikit-learn first to get a sense of the coefficients
reg = linear_model.LinearRegression()
reg.fit(x_data.reshape(-1, 1), y_data) print ("Coefficient of scikit-learn linear regression: k=%f, b=%f" % (reg.coef_, reg.intercept_)) # Then we apply tensorflow to achieve the similar results
# The structure of tensorflow code can be divided into two parts: # First part: set up computation graph
W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
b = tf.Variable(tf.zeros([1]))
y = W * x_data + b loss = tf.reduce_mean(tf.square(y - y_data)) / 2
# 对于tensorflow,梯度下降的步长alpha参数需要很仔细的设置,步子太大容易扯到蛋导致无法收敛;步子太小容易等得蛋疼。迭代次数也需要细致的尝试。
optimizer = tf.train.GradientDescentOptimizer(0.07) # Try 0.1 and you will see unconvergency
train = optimizer.minimize(loss) init = tf.initialize_all_variables() # Second part: launch the graph
sess = tf.Session()
sess.run(init) for step in range(1500):
sess.run(train)
if step % 100 == 0:
print (step, sess.run(W), sess.run(b))
print ("Coeeficient of tensorflow linear regression: k=%f, b=%f" % (sess.run(W), sess.run(b)))
  • 思考:对于tensorflow,梯度下降的步长alpha参数需要很仔细的设置,步子太大容易扯到蛋导致无法收敛;步子太小容易等得蛋疼。迭代次数也需要细致的尝试。

多元线性回归


# -*- coding: utf-8 -*-
"""
Created on Wed Sep 6 19:53:24 2017 @author: Administrator
""" import numpy as np
import tensorflow as tf
from numpy import mat
from sklearn import linear_model
from sklearn import preprocessing # Read x and y
#x_data = np.loadtxt("ex3x.dat").astype(np.float32)
#y_data = np.loadtxt("ex3y.dat").astype(np.float32) x_data = [np.random.rand(100).astype(np.float32),np.random.rand(100).astype(np.float32)+10]
x_data=mat(x_data).T
y_data = 5.3+np.random.rand(100) # We evaluate the x and y by sklearn to get a sense of the coefficients.
reg = linear_model.LinearRegression()
reg.fit(x_data, y_data)
print ("Coefficients of sklearn: K=%s, b=%f" % (reg.coef_, reg.intercept_)) # Now we use tensorflow to get similar results. # Before we put the x_data into tensorflow, we need to standardize it
# in order to achieve better performance in gradient descent;
# If not standardized, the convergency speed could not be tolearated.
# Reason: If a feature has a variance that is orders of magnitude larger than others,
# it might dominate the objective function
# and make the estimator unable to learn from other features correctly as expected.
# 对于梯度下降算法,变量是否标准化很重要。在这个例子中,变量一个是面积,一个是房间数,量级相差很大,如果不归一化,面积在目标函数和梯度中就会占据主导地位,导致收敛极慢。
scaler = preprocessing.StandardScaler().fit(x_data)
print (scaler.mean_, scaler.scale_)
x_data_standard = scaler.transform(x_data) W = tf.Variable(tf.zeros([2, 1]))
b = tf.Variable(tf.zeros([1, 1]))
y = tf.matmul(x_data_standard, W) + b loss = tf.reduce_mean(tf.square(y - y_data.reshape(-1, 1)))/2
optimizer = tf.train.GradientDescentOptimizer(0.3)
train = optimizer.minimize(loss) init = tf.initialize_all_variables() sess = tf.Session()
sess.run(init)
for step in range(100):
sess.run(train)
if step % 10 == 0:
print (step, sess.run(W).flatten(), sess.run(b).flatten()) print ("Coefficients of tensorflow (input should be standardized): K=%s, b=%s" % (sess.run(W).flatten(), sess.run(b).flatten()))
print ("Coefficients of tensorflow (raw input): K=%s, b=%s" % (sess.run(W).flatten() / scaler.scale_, sess.run(b).flatten() - np.dot(scaler.mean_ / scaler.scale_, sess.run(W))))
  • 思路:对于梯度下降算法,变量是否标准化很重要。在这个例子中,变量一个是面积,一个是房间数,量级相差很大,如果不归一化,面积在目标函数和梯度中就会占据主导地位,导致收敛极慢。

逻辑回归

# -*- coding: utf-8 -*-
"""
Created on Wed Sep 6 20:13:15 2017
数据下载:http://openclassroom.stanford.edu/MainFolder/DocumentPage.php?course=DeepLearning&doc=exercises/ex4/ex4.html @author: Administrator
""" import tensorflow as tf
import numpy as np
from numpy import mat
from sklearn.linear_model import LogisticRegression
from sklearn import preprocessing # Read x and y
x_data = np.loadtxt("ex4Data/ex4x.dat").astype(np.float32)
y_data = np.loadtxt("ex4Data/ex4y.dat").astype(np.float32) #x_data = [np.random.rand(100).astype(np.float32),np.random.rand(100).astype(np.float32)+10]
#x_data=mat(x_data).T
#y_data = 5.3+np.random.rand(100) scaler = preprocessing.StandardScaler().fit(x_data)
x_data_standard = scaler.transform(x_data) # We evaluate the x and y by sklearn to get a sense of the coefficients.
reg = LogisticRegression(C=999999999, solver="newton-cg") # Set C as a large positive number to minimize the regularization effect
reg.fit(x_data, y_data)
print ("Coefficients of sklearn: K=%s, b=%f" % (reg.coef_, reg.intercept_)) # Now we use tensorflow to get similar results.
W = tf.Variable(tf.zeros([2, 1]))
b = tf.Variable(tf.zeros([1, 1]))
y = 1 / (1 + tf.exp(-tf.matmul(x_data_standard, W) + b))
loss = tf.reduce_mean(- y_data.reshape(-1, 1) * tf.log(y) - (1 - y_data.reshape(-1, 1)) * tf.log(1 - y)) optimizer = tf.train.GradientDescentOptimizer(1.3)
train = optimizer.minimize(loss) init = tf.initialize_all_variables() sess = tf.Session()
sess.run(init)
for step in range(100):
sess.run(train)
if step % 10 == 0:
print (step, sess.run(W).flatten(), sess.run(b).flatten()) print ("Coefficients of tensorflow (input should be standardized): K=%s, b=%s" % (sess.run(W).flatten(), sess.run(b).flatten()))
print ("Coefficients of tensorflow (raw input): K=%s, b=%s" % (sess.run(W).flatten() / scaler.scale_, sess.run(b).flatten() - np.dot(scaler.mean_ / scaler.scale_, sess.run(W)))) # Problem solved and we are happy. But...
# I'd like to implement the logistic regression from a multi-class viewpoint instead of binary.
# In machine learning domain, it is called softmax regression
# In economic and statistics domain, it is called multinomial logit (MNL) model, proposed by Daniel McFadden, who shared the 2000 Nobel Memorial Prize in Economic Sciences. print ("------------------------------------------------")
print ("We solve this binary classification problem again from the viewpoint of multinomial classification")
print ("------------------------------------------------") # As a tradition, sklearn first
reg = LogisticRegression(C=9999999999, solver="newton-cg", multi_class="multinomial")
reg.fit(x_data, y_data)
print ("Coefficients of sklearn: K=%s, b=%f" % (reg.coef_, reg.intercept_))
print ("A little bit difference at first glance. What about multiply them with 2?") # Then try tensorflow
W = tf.Variable(tf.zeros([2, 2])) # first 2 is feature number, second 2 is class number
b = tf.Variable(tf.zeros([1, 2]))
V = tf.matmul(x_data_standard, W) + b
y = tf.nn.softmax(V) # tensorflow provide a utility function to calculate the probability of observer n choose alternative i, you can replace it with `y = tf.exp(V) / tf.reduce_sum(tf.exp(V), keep_dims=True, reduction_indices=[1])` # Encode the y label in one-hot manner
lb = preprocessing.LabelBinarizer()
lb.fit(y_data)
y_data_trans = lb.transform(y_data)
y_data_trans = np.concatenate((1 - y_data_trans, y_data_trans), axis=1) # Only necessary for binary class loss = tf.reduce_mean(-tf.reduce_sum(y_data_trans * tf.log(y), reduction_indices=[1]))
optimizer = tf.train.GradientDescentOptimizer(1.3)
train = optimizer.minimize(loss) init = tf.initialize_all_variables() sess = tf.Session()
sess.run(init)
for step in range(100):
sess.run(train)
if step % 10 == 0:
print (step, sess.run(W).flatten(), sess.run(b).flatten()) print ("Coefficients of tensorflow (input should be standardized): K=%s, b=%s" % (sess.run(W).flatten(), sess.run(b).flatten()))
print ("Coefficients of tensorflow (raw input): K=%s, b=%s" % ((sess.run(W) / scaler.scale_).flatten(), sess.run(b).flatten() - np.dot(scaler.mean_ / scaler.scale_, sess.run(W))))
  • 思考:
  • 对于逻辑回归,损失函数比线性回归模型复杂了一些。首先需要通过sigmoid函数,将线性回归的结果转化为0至1之间的概率值。然后写出每个样本的发生概率(似然),那么所有样本的发生概率就是每个样本发生概率的乘积。为了求导方便,我们对所有样本的发生概率取对数,保持其单调性的同时,可以将连乘变为求和(加法的求导公式比乘法的求导公式简单很多)。对数极大似然估计方法的目标函数是最大化所有样本的发生概率;机器学习习惯将目标函数称为损失,所以将损失定义为对数似然的相反数,以转化为极小值问题。
  • 我们提到逻辑回归时,一般指的是二分类问题;然而这套思想是可以很轻松就拓展为多分类问题的,在机器学习领域一般称为softmax回归模型。本文的作者是统计学与计量经济学背景,因此一般将其称为MNL模型。

Reference:

tensorflow基础练习:线性模型的更多相关文章

  1. TensorFlow基础

    TensorFlow基础 SkySeraph  2017 Email:skyseraph00#163.com 更多精彩请直接访问SkySeraph个人站点:www.skyseraph.com Over ...

  2. TensorFlow基础笔记(0) 参考资源学习文档

    1 官方文档 https://www.tensorflow.org/api_docs/ 2 极客学院中文文档 http://www.tensorfly.cn/tfdoc/api_docs/python ...

  3. TensorFlow基础笔记(3) cifar10 分类学习

    TensorFlow基础笔记(3) cifar10 分类学习 CIFAR-10 is a common benchmark in machine learning for image recognit ...

  4. TensorFlow基础剖析

    TensorFlow基础剖析 一.概述 TensorFlow 是一个使用数据流图 (Dataflow Graph) 表达数值计算的开源软件库.它使 用节点表示抽象的数学计算,并使用 OP 表达计算的逻 ...

  5. 芝麻HTTP:TensorFlow基础入门

    本篇内容基于 Python3 TensorFlow 1.4 版本. 本节内容 本节通过最简单的示例 -- 平面拟合来说明 TensorFlow 的基本用法. 构造数据 TensorFlow 的引入方式 ...

  6. 05基于python玩转人工智能最火框架之TensorFlow基础知识

    从helloworld开始 mkdir mooc # 新建一个mooc文件夹 cd mooc mkdir 1.helloworld # 新建一个helloworld文件夹 cd 1.helloworl ...

  7. tensorflow基础篇-1

    1.使用占位符和变量 import tensorflow as tf import numpy as np #-----创建变量并初始化----------- def first(): my_var= ...

  8. 5、Tensorflow基础(三)神经元函数及优化方法

    1.激活函数 激活函数(activation function)运行时激活神经网络中某一部分神经元,将激活信息向后传入下一层的神经网络.神经网络之所以能解决非线性问题(如语音.图像识别),本质上就是激 ...

  9. TensorFlow应用实战 | TensorFlow基础知识

    挺长的~超出估计值了~预计阅读时间20分钟. 从helloworld开始 mkdir 1.helloworld cd 1.helloworldvim helloworld.py 代码: # -*- c ...

随机推荐

  1. [LUOGU] 1090 合并果子

    题目描述 在一个果园里,多多已经将所有的果子打了下来,而且按果子的不同种类分成了不同的堆.多多决定把所有的果子合成一堆. 每一次合并,多多可以把两堆果子合并到一起,消耗的体力等于两堆果子的重量之和.可 ...

  2. windows 7虚拟机与主机不能互ping通,但是都能与网关ping通

    这里是在Windows 10的环境下使用VMware安装了一个Windows 7的虚拟机,虚拟机中是使用桥接的方式.结果发现虚拟机不能与物理机互通,但是却能与网关互通.查看虚拟机和物理机的IP发现都是 ...

  3. Linux基础学习-使用Squid部署代理缓存服务

    使用Squid部署代理缓存服务 Squid是Linux系统中最为流行的一款高性能代理服务软件,通常作为Web网站的前置缓存服务,能够代替用户向网站服务器请求页面数据并进行缓存.Squid服务配置简单. ...

  4. hosts设置本地虚拟域名

    C:\Windows\System32\drivers\etc hosts 需要用管理员运行

  5. laravel模型关联与列表展示

    上面这个是一个模型关联的图,其实我们很容易去理解 比如说,一对一,也就是说一个用户对应的是一个手机号. 一对多,比如说一篇文章可以有多条评论 一对多反向:如一篇文章可以有多条评论,但对应每条评论也只针 ...

  6. Verilog学习笔记基本语法篇(五)········ 条件语句

    条件语句可以分为if_else语句和case语句两张部分. A)if_else语句 三种表达形式 1) if(表达式)          2)if(表达式)               3)if(表达 ...

  7. 每周一题 3n+1问题

    3n+1问题 #include<iostream> #include<math.h> #include<map> using namespace std; map& ...

  8. i++和++i的区别,及其线程安全问题

    i++和++i都是i=i+1的意思,但是过程有些许区别: i++:先赋值再自加.(例如:i=1:a=1+i++:结果为a=1+1=2,语句执行完后i再进行自加为2) ++i:先自加再赋值.(例如:i= ...

  9. hdu2024

    这题目感觉不是很严谨,如果是关键字的话也是不能作为合法标识符的,但是这个不用检测,就算要检测也会很费劲,还得用字符串匹配,而且还得知道一共都有哪些关键字,太麻烦了,所以出题人原意就是检查大小写字母数字 ...

  10. 九度oj 题目1160:放苹果

    题目描述: 把M个同样的苹果放在N个同样的盘子里,允许有的盘子空着不放,问共有多少种不同的分法?(用K表示)5,1,1和1,5,1 是同一种分法. 输入: 第一行是测试数据的数目t(0 <= t ...