[Deep Learning] Code Analysis of a Logistic Regression Network Implemented in Python/Theano

2014-07-21 10:28:34

First, here is the main Python (2.7) code; it can be found on the Deep Learning tutorial site.

    # allocate symbolic variables for the data
    index = T.lscalar()  # index to a [mini]batch
    x = T.matrix('x')  # the data is presented as rasterized images
    y = T.ivector('y')  # the labels are presented as 1D vector of
                        # [int] labels

    # construct the logistic regression class
    # Each MNIST image has size 28*28
    classifier = LogisticRegression(input=x, n_in=28 * 28, n_out=10)

    # the cost we minimize during training is the negative log likelihood of
    # the model in symbolic format
    cost = classifier.negative_log_likelihood(y)

    # compiling a Theano function that computes the mistakes that are made by
    # the model on a minibatch
    test_model = theano.function(inputs=[index],
            outputs=classifier.errors(y),
            givens={
                x: test_set_x[index * batch_size:(index + 1) * batch_size],
                y: test_set_y[index * batch_size:(index + 1) * batch_size]})

    validate_model = theano.function(inputs=[index],
            outputs=classifier.errors(y),
            givens={
                x: valid_set_x[index * batch_size:(index + 1) * batch_size],
                y: valid_set_y[index * batch_size:(index + 1) * batch_size]})

    # compute the gradient of cost with respect to theta = (W,b)
    g_W = T.grad(cost=cost, wrt=classifier.W)
    g_b = T.grad(cost=cost, wrt=classifier.b)

    # specify how to update the parameters of the model as a list of
    # (variable, update expression) pairs.
    updates = [(classifier.W, classifier.W - learning_rate * g_W),
               (classifier.b, classifier.b - learning_rate * g_b)]

    # compiling a Theano function `train_model` that returns the cost, but at
    # the same time updates the parameters of the model based on the rules
    # defined in `updates`
    train_model = theano.function(inputs=[index],
            outputs=cost,
            updates=updates,
            givens={
                x: train_set_x[index * batch_size:(index + 1) * batch_size],
                y: train_set_y[index * batch_size:(index + 1) * batch_size]})

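The code is not very long, but the logical relationships in it need to be untangled. The sections below go through it piece by piece.

In the code, T is an alias for theano.tensor.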

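Lines 1-13:

    # allocate symbolic variables for the data
    index = T.lscalar()  # index to a [mini]batch
    x = T.matrix('x')  # the data is presented as rasterized images
    y = T.ivector('y')  # the labels are presented as 1D vector of
                        # [int] labels

    # construct the logistic regression class
    # Each MNIST image has size 28*28
    classifier = LogisticRegression(input=x, n_in=28 * 28, n_out=10)

    # the cost we minimize during training is the negative log likelihood of
    # the model in symbolic format
    cost = classifier.negative_log_likelihood(y)

This declares three symbolic variables, index, x, and y (similar to symbolic variables in Matlab). They stand for the minibatch index, the input image matrix, and the vector of expected outputs (labels), respectively.

classifier is a LogisticRegression (LR) object. Its constructor is called with the symbolic variable x as input, so every expression the classifier builds depends on x. Later, theano.function ties concrete data to x; when the value supplied for x changes, the classifier's outputs change with it.

cost refers to the classifier's negative log-likelihood, computed with the symbolic variable y as input. It is tied to y in the same way that classifier is tied to x, so it is not discussed further here.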
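The LogisticRegression class itself is not shown in this excerpt. As a rough sketch of what negative_log_likelihood(y) evaluates to, assuming the class is a standard softmax classifier whose cost is averaged over the minibatch (that assumption, and the helper names below, are not taken from the original code), the quantity can be written in plain Python:

```python
import math

def softmax(scores):
    """Turn a list of raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def negative_log_likelihood(batch_scores, labels):
    """Mean of -log p(correct class) over a minibatch.

    batch_scores: one list of class scores per example
    labels: the integer class label of each example (the role of y)
    """
    losses = []
    for scores, label in zip(batch_scores, labels):
        probs = softmax(scores)
        losses.append(-math.log(probs[label]))
    return sum(losses) / len(losses)

# Made-up scores for 2 examples and 3 classes.
batch_scores = [[2.0, 0.5, 0.1], [0.2, 0.2, 0.2]]
labels = [0, 2]
nll = negative_log_likelihood(batch_scores, labels)
```

The cost is small when the probability assigned to the correct class is close to 1 and grows as that probability shrinks, which is what makes it a usable training objective.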

Lines 14-28:

    # compiling a Theano function that computes the mistakes that are made by
    # the model on a minibatch
    test_model = theano.function(inputs=[index],
            outputs=classifier.errors(y),
            givens={
                x: test_set_x[index * batch_size:(index + 1) * batch_size],
                y: test_set_y[index * batch_size:(index + 1) * batch_size]})

    validate_model = theano.function(inputs=[index],
            outputs=classifier.errors(y),
            givens={
                x: valid_set_x[index * batch_size:(index + 1) * batch_size],
                y: valid_set_y[index * batch_size:(index + 1) * batch_size]})

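These two models are where readers tend to get confused. Understanding theano.function requires a little background.

For example, declare two symbolic variables a and b: a, b = T.iscalar(), T.iscalar(). Both are integer (i) scalars. Then declare another variable c: c = a + b. We can inspect its type with type(c):

    >>> type(c)
    <class 'theano.tensor.var.TensorVariable'>
    >>> type(a)
    <class 'theano.tensor.var.TensorVariable'>

c has the same type as a and b: they are all tensor variables. With the preparation done, we use theano.function to build the relationship: add = theano.function(inputs = [a, b], outputs = c). This statement constructs a function add that takes a and b as inputs and returns c. In Python it is used like this:

    >>> add = theano.function(inputs = [a, b], outputs = c)
    >>> test = add(100, 100)
    >>> test
    array(200)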
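Conceptually, a symbolic variable is a placeholder in an expression graph, and theano.function turns such a graph into something callable. The toy classes below imitate that idea in plain Python. They are an analogy only: Var, Add and function are hypothetical names written for this sketch, not Theano's implementation, and Theano additionally optimizes and compiles the graph.

```python
class Var(object):
    """A named placeholder with no value yet, like a symbolic scalar."""
    def __init__(self, name):
        self.name = name

    def __add__(self, other):
        # Adding two placeholders builds a graph node instead of a number.
        return Add(self, other)

    def evaluate(self, env):
        return env[self.name]

class Add(object):
    """Graph node representing left + right."""
    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self, env):
        return self.left.evaluate(env) + self.right.evaluate(env)

def function(inputs, outputs):
    """Return a callable that binds positional arguments to the inputs."""
    def compiled(*args):
        env = dict((v.name, value) for v, value in zip(inputs, args))
        return outputs.evaluate(env)
    return compiled

a, b = Var("a"), Var("b")
c = a + b                      # builds an expression; nothing is computed yet
add = function(inputs=[a, b], outputs=c)
result = add(100, 100)         # values are supplied only at call time
```

The point of the analogy is the two-phase workflow: first describe the computation with placeholders, then call the resulting function with concrete values.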

With this background, the meaning of the two models becomes clear:

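    test_model = theano.function(inputs=[index],
            outputs=classifier.errors(y),
            givens={
                x: test_set_x[index * batch_size:(index + 1) * batch_size],
                y: test_set_y[index * batch_size:(index + 1) * batch_size]})

The input is index, and the output is the return value of the classifier object's errors method, which takes y as its argument. The classifier itself takes x as its input.

The givens keyword replaces the variable before each colon with the expression after the colon. Here, x and y are replaced with the index-th batch of the test data (each batch contains batch_size examples).

In plain language, test_model is a function that takes a batch index, feeds the image data x and expected outputs y of the index-th test batch into the classifier, and returns the error on that batch.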
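The slicing expression inside givens can be tried out with ordinary Python lists. The sketch below is only an analogy: the data, the batch_size value and the helper get_batch are made up for illustration, and real Theano code slices shared variables symbolically rather than Python lists.

```python
# Made-up stand-ins for test_set_x / test_set_y: 10 examples, 2 features each.
test_set_x = [[float(i), float(i) + 0.5] for i in range(10)]
test_set_y = [i % 3 for i in range(10)]
batch_size = 4

def get_batch(index):
    """Return the index-th minibatch, using the same slice as in givens."""
    start = index * batch_size
    stop = (index + 1) * batch_size
    return test_set_x[start:stop], test_set_y[start:stop]

x0, y0 = get_batch(0)   # examples 0..3
x2, y2 = get_batch(2)   # examples 8..9: with lists, the last slice is shorter
```

Because the function's only input is index, the caller never passes image data directly; the data to use is fully determined by the batch number.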

    validate_model = theano.function(inputs=[index],
            outputs=classifier.errors(y),
            givens={
                x: valid_set_x[index * batch_size:(index + 1) * batch_size],
                y: valid_set_y[index * batch_size:(index + 1) * batch_size]})

The same applies here, except that the validation data is used.

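Lines 29-32:

    # compute the gradient of cost with respect to theta = (W,b)
    g_W = T.grad(cost=cost, wrt=classifier.W)
    g_b = T.grad(cost=cost, wrt=classifier.b)

This computes the gradients used by the learning algorithm. In general, T.grad(y, x) computes the gradient of y with respect to x (here y and x are generic placeholders, not the label and image variables declared above).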
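T.grad derives the gradient symbolically, so no derivative has to be written by hand. To make concrete what "the gradient of the cost with respect to a parameter" means, here is a small plain-Python sketch for a one-feature binary logistic model. This is a simplification chosen for illustration, not the 10-class model above, and all names and numbers are hypothetical. It compares a hand-derived gradient with a finite-difference estimate.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cost(w, b, xs, ys):
    """Mean negative log-likelihood of a one-feature binary logistic model."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(xs)

def grad(w, b, xs, ys):
    """Hand-derived gradient of cost with respect to (w, b)."""
    g_w = g_b = 0.0
    for x, y in zip(xs, ys):
        err = sigmoid(w * x + b) - y   # derivative w.r.t. (w*x + b), one example
        g_w += err * x
        g_b += err
    n = len(xs)
    return g_w / n, g_b / n

xs, ys = [0.5, -1.0, 2.0], [1, 0, 1]
w, b, eps = 0.3, -0.1, 1e-6
g_w, g_b = grad(w, b, xs, ys)

# Central finite differences should agree with the analytic gradient.
num_g_w = (cost(w + eps, b, xs, ys) - cost(w - eps, b, xs, ys)) / (2 * eps)
num_g_b = (cost(w, b + eps, xs, ys) - cost(w, b - eps, xs, ys)) / (2 * eps)
```

With T.grad, the equivalent of the grad function above is produced automatically from the expression that defines cost.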

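Lines 33-37:

    # specify how to update the parameters of the model as a list of
    # (variable, update expression) pairs.
    updates = [(classifier.W, classifier.W - learning_rate * g_W),
               (classifier.b, classifier.b - learning_rate * g_b)]

updates is a list of length 2, and each element is a tuple. When this list is passed to theano.function, every call to the resulting function overwrites the first element of each tuple (a shared variable) with the value of the second element.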
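The effect of an updates list can be imitated in plain Python. The sketch below is an analogy only: params, apply_updates and the numbers are invented for illustration. One detail it mirrors is that all new values are computed from the old parameter values before any parameter is overwritten.

```python
# Hypothetical parameter store playing the role of the shared variables W and b.
params = {"W": 0.8, "b": -0.2}
learning_rate = 0.1

def apply_updates(params, updates):
    """Apply (name, update expression) pairs once, as one function call would.

    Every new value is computed from the old parameters first; only then are
    the parameters overwritten.
    """
    new_values = [(name, expr(params)) for name, expr in updates]
    for name, value in new_values:
        params[name] = value

# Pretend gradients for one minibatch (made-up numbers).
g_W, g_b = 0.5, -1.0
updates = [
    ("W", lambda p: p["W"] - learning_rate * g_W),
    ("b", lambda p: p["b"] - learning_rate * g_b),
]
apply_updates(params, updates)   # one gradient-descent step
```

Each pair reads as "new value of the parameter = old value minus learning_rate times its gradient", which is one step of gradient descent.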

Lines 38-46:

    # compiling a Theano function `train_model` that returns the cost, but at
    # the same time updates the parameters of the model based on the rules
    # defined in `updates`
    train_model = theano.function(inputs=[index],
            outputs=cost,
            updates=updates,
            givens={
                x: train_set_x[index * batch_size:(index + 1) * batch_size],
                y: train_set_y[index * batch_size:(index + 1) * batch_size]})

Most of this has already been covered. The thing to note is the added updates argument, which specifies how the parameters (W, b) are modified on every call to train_model. In addition, the output is now cost (the negative log-likelihood) rather than the value of errors (the classification error) returned by test_model and validate_model.
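The difference between the two kinds of output can be seen on a tiny example. The sketch below assumes that errors(y) is the fraction of misclassified examples in the batch; the LogisticRegression class is not shown in this excerpt, so that reading, the helper names and the probabilities are all assumptions made for illustration.

```python
import math

def error_rate(batch_probs, labels):
    """Fraction of examples whose most probable class is not the label."""
    wrong = 0
    for probs, label in zip(batch_probs, labels):
        predicted = probs.index(max(probs))
        if predicted != label:
            wrong += 1
    return wrong / float(len(labels))

def negative_log_likelihood(batch_probs, labels):
    """Mean of -log p(correct class) over the batch."""
    total = sum(-math.log(p[l]) for p, l in zip(batch_probs, labels))
    return total / len(labels)

labels = [0, 1]
confident = [[0.9, 0.1], [0.2, 0.8]]     # both correct, high confidence
hesitant = [[0.6, 0.4], [0.45, 0.55]]    # both correct, low confidence

# Both sets of predictions make no mistakes, so the error rate cannot tell
# them apart, while the negative log-likelihood still can.
err_confident = error_rate(confident, labels)
err_hesitant = error_rate(hesitant, labels)
nll_confident = negative_log_likelihood(confident, labels)
nll_hesitant = negative_log_likelihood(hesitant, labels)
```

This is one way to see why training minimizes the cost while testing and validation report the error: the cost changes smoothly as the parameters change, whereas the error only changes when a prediction flips.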
