In the multiclass case, the training algorithm uses the one-vs-rest (OvR)
scheme if the 'multi_class' option is set to 'ovr', and uses the cross-
entropy loss if the 'multi_class' option is set to 'multinomial'.
(Currently the 'multinomial' option is supported only by the 'lbfgs',
'sag' and 'newton-cg' solvers.)
This class implements regularized logistic regression using the
'liblinear' library, 'newton-cg', 'sag' and 'lbfgs' solvers. It can handle
both dense and sparse input. Use C-ordered arrays or CSR matrices
containing 64-bit floats for optimal performance; any other input format
will be converted (and copied).
The 'newton-cg', 'sag', and 'lbfgs' solvers support only L2 regularization
with primal formulation. The 'liblinear' solver supports both L1 and L2
regularization, with a dual formulation only for the L2 penalty.
Read more in the :ref:`User Guide <logistic_regression>`.
Parameters
----------
penalty : str, 'l1' or 'l2', default: 'l2'
Used to specify the norm used in the penalization. The 'newton-cg',
'sag' and 'lbfgs' solvers support only l2 penalties.
.. versionadded:: 0.19
l1 penalty with SAGA solver (allowing 'multinomial' + L1)
dual : bool, default: False
Dual or primal formulation. Dual formulation is only implemented for
l2 penalty with liblinear solver. Prefer dual=False when
n_samples > n_features.
tol : float, default: 1e-4
Tolerance for stopping criteria.
C : float, default: 1.0
Inverse of regularization strength; must be a positive float.
Like in support vector machines, smaller values specify stronger
regularization.
fit_intercept : bool, default: True
Specifies if a constant (a.k.a. bias or intercept) should be
added to the decision function.
solver : {'newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga'}, default: 'liblinear'
Algorithm to use in the optimization problem.
- For small datasets, 'liblinear' is a good choice, whereas 'sag' and
'saga' are faster for large ones.
- For multiclass problems, only 'newton-cg', 'sag', 'saga' and 'lbfgs'
handle multinomial loss; 'liblinear' is limited to one-versus-rest
schemes.
- 'newton-cg', 'lbfgs' and 'sag' only handle L2 penalty, whereas
'liblinear' and 'saga' handle L1 penalty.
Note that 'sag' and 'saga' fast convergence is only guaranteed on
features with approximately the same scale. You can
preprocess the data with a scaler from sklearn.preprocessing.
.. versionadded:: 0.17
Stochastic Average Gradient descent solver.
.. versionadded:: 0.19
SAGA solver.
multi_class : str, {'ovr', 'multinomial'}, default: 'ovr'
Multiclass option can be either 'ovr' or 'multinomial'. If the option
chosen is 'ovr', then a binary problem is fit for each label. Else
the loss minimised is the multinomial loss fit across
the entire probability distribution. Does not work for liblinear
solver.
.. versionadded:: 0.18
Stochastic Average Gradient descent solver for 'multinomial' case.
Attributes
----------
coef_ : array, shape (1, n_features) or (n_classes, n_features)
Coefficient of the features in the decision function.
`coef_` is of shape (1, n_features) when the given problem
is binary.
intercept_ : array, shape (1,) or (n_classes,)
Intercept (a.k.a. bias) added to the decision function.
If `fit_intercept` is set to False, the intercept is set to zero.
`intercept_` is of shape (1,) when the problem is binary.
n_iter_ : array, shape (n_classes,) or (1, )
Actual number of iterations for all classes. If binary or multinomial,
it returns only 1 element. For liblinear solver, only the maximum
number of iteration across all classes is given.
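
To make the solver, multi_class option and attribute shapes above concrete, here is a minimal sketch using scikit-learn's bundled digits dataset (the dataset choice, parameter values and variable names are illustrative, not from the original post):

from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression

# A small 10-class dataset: 1797 samples, 64 pixel features.
X, y = load_digits(return_X_y=True)

# The multinomial (softmax) loss needs 'lbfgs', 'newton-cg', 'sag' or 'saga';
# 'liblinear' only supports the one-vs-rest scheme.
clf = LogisticRegression(solver="lbfgs", multi_class="multinomial",
                         C=1.0, max_iter=500)
clf.fit(X, y)

print(clf.coef_.shape)       # (10, 64): one weight row per class
print(clf.intercept_.shape)  # (10,): one bias per class
print(clf.n_iter_)           # iterations used by lbfgs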

Softmax regression on MNIST

# -*- coding: utf-8 -*-
"""
Created on Thu Sep 7 10:47:18 2017

@author: Administrator
"""
import gzip
import struct

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn import preprocessing
from sklearn.metrics import accuracy_score
import tensorflow as tf


# MNIST data is stored in binary format, and we transform it into numpy
# ndarray objects with the following two utility functions.
def read_image(file_name):
    with gzip.open(file_name, 'rb') as f:
        buf = f.read()

    index = 0
    magic, images, rows, columns = struct.unpack_from('>IIII', buf, index)
    index += struct.calcsize('>IIII')

    image_size = '>' + str(images * rows * columns) + 'B'
    ims = struct.unpack_from(image_size, buf, index)

    im_array = np.array(ims).reshape(images, rows, columns)
    return im_array


def read_label(file_name):
    with gzip.open(file_name, 'rb') as f:
        buf = f.read()

    index = 0
    magic, labels = struct.unpack_from('>II', buf, index)
    index += struct.calcsize('>II')

    label_size = '>' + str(labels) + 'B'
    labels = struct.unpack_from(label_size, buf, index)

    label_array = np.array(labels)
    return label_array


print("Start processing MNIST handwritten digits data...")
train_x_data = read_image("MNIST_data/train-images-idx3-ubyte.gz")
train_x_data = train_x_data.reshape(train_x_data.shape[0], -1).astype(np.float32)
train_y_data = read_label("MNIST_data/train-labels-idx1-ubyte.gz")
test_x_data = read_image("MNIST_data/t10k-images-idx3-ubyte.gz")
test_x_data = test_x_data.reshape(test_x_data.shape[0], -1).astype(np.float32)
test_y_data = read_label("MNIST_data/t10k-labels-idx1-ubyte.gz")

# Scale pixel values into [0, 1].
train_x_minmax = train_x_data / 255.0
test_x_minmax = test_x_data / 255.0

# Of course you can also use the utility function provided by tensorflow to read MNIST:
# from tensorflow.examples.tutorials.mnist import input_data
# mnist = input_data.read_data_sets("MNIST_data/", one_hot=False)
# train_x_minmax = mnist.train.images
# train_y_data = mnist.train.labels
# test_x_minmax = mnist.test.images
# test_y_data = mnist.test.labels

# We evaluate the softmax regression model with sklearn first.
eval_sklearn = False
if eval_sklearn:
    print("Start evaluating softmax regression model by sklearn...")
    reg = LogisticRegression(solver="lbfgs", multi_class="multinomial")
    reg.fit(train_x_minmax, train_y_data)
    # Save coefficients to a text file.
    np.savetxt('coef_softmax_sklearn.txt', reg.coef_, fmt='%.6f')
    test_y_predict = reg.predict(test_x_minmax)
    print("Accuracy of test set: %f" % accuracy_score(test_y_data, test_y_predict))

eval_tensorflow = True
batch_gradient = False
if eval_tensorflow:
    print("Start evaluating softmax regression model by tensorflow...")
    # Reformat y into one-hot encoding style.
    lb = preprocessing.LabelBinarizer()
    lb.fit(train_y_data)
    train_y_data_trans = lb.transform(train_y_data)
    test_y_data_trans = lb.transform(test_y_data)

    x = tf.placeholder(tf.float32, [None, 784])
    W = tf.Variable(tf.zeros([784, 10]))
    b = tf.Variable(tf.zeros([10]))
    V = tf.matmul(x, W) + b
    y = tf.nn.softmax(V)

    y_ = tf.placeholder(tf.float32, [None, 10])
    loss = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
    optimizer = tf.train.GradientDescentOptimizer(0.5)
    train = optimizer.minimize(loss)

    # tf.initialize_all_variables() is deprecated in favor of
    # tf.global_variables_initializer() (see the warning in the result below),
    # but it still works here.
    init = tf.initialize_all_variables()
    sess = tf.Session()
    sess.run(init)

    if batch_gradient:
        for step in range(300):
            sess.run(train, feed_dict={x: train_x_minmax, y_: train_y_data_trans})
            if step % 10 == 0:
                print("Batch Gradient Descent processing step %d" % step)
        print("Finally we got the estimated results, take such a long time...")
    else:
        for step in range(1000):
            sample_index = np.random.choice(train_x_minmax.shape[0], 100)
            batch_xs = train_x_minmax[sample_index, :]
            batch_ys = train_y_data_trans[sample_index, :]
            sess.run(train, feed_dict={x: batch_xs, y_: batch_ys})
            if step % 100 == 0:
                print("Stochastic Gradient Descent processing step %d" % step)

    # Save coefficients to a text file.
    np.savetxt('coef_softmax_tf.txt', np.transpose(sess.run(W)), fmt='%.6f')

    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print("Accuracy of test set: %f" % sess.run(
        accuracy, feed_dict={x: test_x_minmax, y_: test_y_data_trans}))
  • Notes:
  • A Variable is a modifiable tensor that lives in TensorFlow's graph of interacting operations. It can be used and even modified by the computation. For machine learning applications, one generally has the model parameters be Variables.
  • Judging by test-set accuracy, both implementations land around 92%, with sklearn slightly better. Note that 92% may look decent, but it is actually a low accuracy for MNIST; as the official tutorial puts it, we should be a bit embarrassed by it.
  • sklearn's estimation takes a while, because every parameter update computes the loss over the full training set, then the gradient, and only then improves the result.
  • When tensorflow uses the batch gradient descent estimator, it is also slow, for the same reason.
  • When tensorflow uses stochastic gradient descent, estimation is fast and the final result is still good: each iteration computes the loss and gradient on only a subset of the data, which speeds things up but may increase bias; increasing the number of iterations then reduces the variance, so the overall error is not much worse than batch gradient descent (see the sketch after this list).
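
To make that last point concrete, here is a minimal NumPy-only sketch of one stochastic-gradient-descent update for softmax regression (the gradient formula is the standard one; the helper names, learning rate and batch size are illustrative, not from the original post):

import numpy as np

def softmax(z):
    # Subtract the row-wise max for numerical stability before exponentiating.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def sgd_step(W, b, X, Y, lr=0.5, batch_size=100, rng=np.random):
    # Draw a random mini-batch instead of using the full training set,
    # just like np.random.choice(...) in the script above.
    idx = rng.choice(X.shape[0], batch_size)
    Xb, Yb = X[idx], Y[idx]
    P = softmax(Xb @ W + b)                    # predicted class probabilities
    grad_W = Xb.T @ (P - Yb) / batch_size      # gradient of the cross-entropy loss
    grad_b = (P - Yb).mean(axis=0)
    return W - lr * grad_W, b - lr * grad_b

# Hypothetical usage with MNIST-shaped data: 784 features, 10 classes.
# W, b = np.zeros((784, 10)), np.zeros(10)
# for step in range(1000):
#     W, b = sgd_step(W, b, train_x_minmax, train_y_data_trans)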

Official demo (from the TensorFlow website)

  • Downloads the data automatically

# Copyright 2015 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""A very simple MNIST classifier.

See extensive documentation at
https://www.tensorflow.org/get_started/mnist/beginners
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import sys

from tensorflow.examples.tutorials.mnist import input_data

import tensorflow as tf

FLAGS = None


def main(_):
  # Import data
  mnist = input_data.read_data_sets(FLAGS.data_dir, one_hot=True)

  # Create the model
  x = tf.placeholder(tf.float32, [None, 784])
  W = tf.Variable(tf.zeros([784, 10]))
  b = tf.Variable(tf.zeros([10]))
  y = tf.matmul(x, W) + b

  # Define loss and optimizer
  y_ = tf.placeholder(tf.float32, [None, 10])

  # The raw formulation of cross-entropy,
  #
  #   tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(tf.nn.softmax(y)),
  #                                 reduction_indices=[1]))
  #
  # can be numerically unstable.
  #
  # So here we use tf.nn.softmax_cross_entropy_with_logits on the raw
  # outputs of 'y', and then average across the batch.
  cross_entropy = tf.reduce_mean(
      tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
  train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

  sess = tf.InteractiveSession()
  tf.global_variables_initializer().run()

  # Train
  # In each step of this loop we randomly grab a batch of 100 training points
  # and feed them in place of the placeholders to run train_step.
  # Training on small random subsets of the data is called stochastic training --
  # here, more precisely, stochastic gradient descent. Ideally we would use all
  # of the data at every step, since that gives better training results, but it
  # is computationally expensive. Using a different subset each time is cheap
  # and still captures the overall statistics of the dataset.
  for _ in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

  # Test trained model
  correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
  accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
  print(sess.run(accuracy, feed_dict={x: mnist.test.images,
                                      y_: mnist.test.labels}))


if __name__ == '__main__':
  parser = argparse.ArgumentParser()
  parser.add_argument('--data_dir', type=str,
                      default='/tmp/tensorflow/mnist/input_data',
                      help='Directory for storing input data')
  FLAGS, unparsed = parser.parse_known_args()
  tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
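
The demo's comment about numerical stability deserves a quick illustration. Below is a small NumPy sketch (illustrative, not part of the original post or of TensorFlow's API) of why computing log(softmax(logits)) directly can blow up, and of the log-sum-exp shift that a stable cross-entropy implementation such as tf.nn.softmax_cross_entropy_with_logits is conceptually based on:

import numpy as np

logits = np.array([[1000.0, 0.0, -1000.0]])   # extreme but legal logits
labels = np.array([[1.0, 0.0, 0.0]])          # one-hot target

# Naive route: softmax first, then log. np.exp(1000) overflows to inf,
# so the probabilities and the loss degenerate to nan.
naive_softmax = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
naive_loss = -(labels * np.log(naive_softmax)).sum(axis=1)
print(naive_loss)   # [nan] (with overflow warnings)

# Stable route: compute log-softmax directly via the log-sum-exp trick,
# shifting by the per-row maximum before exponentiating.
shifted = logits - logits.max(axis=1, keepdims=True)
log_softmax = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
stable_loss = -(labels * log_softmax).sum(axis=1)
print(stable_loss)  # [0.], the correct cross-entropy for this example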
  • Result
Start processing MNIST handwritten digits data...
Start evaluating softmax regression model by tensorflow...
WARNING:tensorflow:From D:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\util\tf_should_use.py:175: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Use `tf.global_variables_initializer` instead.
2017-09-08 16:47:36.504803: W C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-09-08 16:47:36.504803: W C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
Stochastic Gradient Descent processing step 0
Stochastic Gradient Descent processing step 100
Stochastic Gradient Descent processing step 200
Stochastic Gradient Descent processing step 300
Stochastic Gradient Descent processing step 400
Stochastic Gradient Descent processing step 500
Stochastic Gradient Descent processing step 600
Stochastic Gradient Descent processing step 700
Stochastic Gradient Descent processing step 800
Stochastic Gradient Descent processing step 900
Accuracy of test set: 0.915600
