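Handwritten Digit Recognition with TensorFlow

If you repost this article, please credit the source and include a link. Thank you.

In 2012, Alex Krizhevsky, Geoff Hinton, and Ilya Sutskever won the ImageNet challenge, and CNN-based image recognition began to attract widespread attention. CNNs became the gold standard for image classification, and the research community has since explored deep neural networks for image recognition on a large scale. Today, on some benchmarks such as ImageNet classification, deep learning models report error rates below those of human annotators. The image recognition series on this public account will go step by step, one layer deeper each time. In this article, the author walks readers through CNN-based recognition of handwritten digit images.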

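Tool requirements

The tools and environment are listed below. If you run into problems installing TensorFlow, feel free to contact the author to work through them.

Python 2.7.14
TensorFlow 1.5
pip 10.0.1
Linux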

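The MNIST dataset

Recognizing handwritten digits on the MNIST dataset is a classic first exercise in deep learning. The dataset consists of 60000 training images and 10000 test images, each a 28*28-pixel grayscale image. To get the dataset, download the 4 compressed files shown in Figure 1 directly from the official site (http://yann.lecun.com/exdb/mnist/), or follow this public account and reply "mnist数据集" ("mnist dataset") in the backend. Note that the downloaded files are in a binary format, not ordinary image files.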
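
To make "binary format" concrete, here is a minimal sketch of reading an MNIST image file with only the Python standard library. It relies on the IDX layout described on the MNIST site: a header of four big-endian 32-bit integers (magic number 2051, image count, rows, columns) followed by one unsigned byte per pixel. The function names are made up for illustration and are not part of TensorFlow; the training script in this article does not need them.

```python
import gzip
import struct

def parse_idx_images(raw):
    """Parse the bytes of a decompressed IDX image file.

    Returns (count, rows, cols, images), where images is a list with
    one bytes object of rows*cols pixels per image.
    """
    # Header: four big-endian unsigned 32-bit integers.
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != 2051:  # 2051 (0x00000803) marks an IDX image file
        raise ValueError("not an IDX image file, magic=%d" % magic)
    size = rows * cols
    images = [raw[16 + i * size: 16 + (i + 1) * size] for i in range(count)]
    return count, rows, cols, images

def load_idx_images(path):
    """Read a gzip-compressed IDX file such as train-images-idx3-ubyte.gz."""
    with gzip.open(path, "rb") as f:
        return parse_idx_images(f.read())

# Self-check on a tiny synthetic file: 2 images of 2x2 pixels.
demo = struct.pack(">IIII", 2051, 2, 2, 2) + bytes(bytearray([0, 255, 10, 20, 1, 2, 3, 4]))
count, rows, cols, images = parse_idx_images(demo)
print(count, rows, cols, list(bytearray(images[1])))
```

For the real files, load_idx_images("MNIST_data/train-images-idx3-ubyte.gz") should report 60000 images of 28 by 28 pixels, assuming the file has been downloaded to that path.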

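Figure 1: The MNIST dataset

Note: many articles online say that the dataset must first be converted with input_data.py. There is no need to track down that source file separately, because the training script in this article imports input_data from the TensorFlow package itself and handles the data directly. If you are still curious, send "input_data.py" as a message to the public account backend to get the file.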

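Model training

The CNN training code is below. Running it (1) downloads the training dataset automatically into the MNIST_data folder in the script's directory; (2) saves the trained model automatically under ./ckpt_dir; (3) resumes automatically from the previous training run. Note: the script takes one argument, the total number of training steps for the CNN, counted across resumed runs. When the argument is 0, it defaults to 10000 steps, the value set in the code. The author trained for 16000 steps; a test at 12560 steps already showed good recognition. To raise accuracy further, train for more steps or enrich the dataset. On to the code: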

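# -*- coding: utf-8 -*-
import os
import sys  # needed for sys.argv below

import tensorflow as tf
import tensorflow.examples.tutorials.mnist.input_data as input_data

# Download MNIST into ./MNIST_data on the first run and load it with one-hot labels
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

# Inputs: flattened 28x28 images and one-hot labels for the 10 digits
x = tf.placeholder("float", shape=[None, 784])
y_ = tf.placeholder("float", shape=[None, 10])

# Not used by the CNN. They still take the auto-generated names "Variable" and
# "Variable_1", so keep them identical in the training and test scripts;
# otherwise the variable names stored in the checkpoint will not match.
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))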

# Weight and bias initialization
def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)

def bias_variable(shape):
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)

# Convolution (stride 1, SAME padding) and 2x2 max pooling
def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1], padding='SAME')

# First convolutional layer: 5x5 kernels, 1 input channel, 32 feature maps
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])

# Reshape each flat 784-vector back into a 28x28 image with 1 channel
x_image = tf.reshape(x, [-1,28,28,1])

h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

# Second convolutional layer: 5x5 kernels, 32 input channels, 64 feature maps
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])

h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

# Fully connected layer: two 2x2 poolings reduce 28x28 to 7x7
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])

h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# Dropout: keep_prob is fed as 0.5 during training and 1.0 during evaluation
keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# Output layer
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])

# Training and evaluation
y_conv=tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
# Clip the probabilities so tf.log never receives an exact 0, which would make the loss NaN
cross_entropy = -tf.reduce_sum(y_*tf.log(tf.clip_by_value(y_conv, 1e-10, 1.0)))
train_step = tf.train.AdagradOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

ckpt_dir = "./ckpt_dir"
if not os.path.exists(ckpt_dir):
    os.makedirs(ckpt_dir)
# Step counter: not trainable, but stored in the checkpoint so training can resume
global_step = tf.Variable(0, name='global_step', trainable=False)
# Build the increment op once, outside the loop, so the graph does not grow on every step
increment_step = tf.assign_add(global_step, 1)
saver = tf.train.Saver()

# Target step count from the command line; 0 selects the default
if int(sys.argv[1])>0:
    end=int(sys.argv[1])
else:
    end=10000

with tf.Session() as sess:
    ckpt = tf.train.get_checkpoint_state(ckpt_dir)
    if ckpt and ckpt.model_checkpoint_path:
        print(ckpt.model_checkpoint_path)
        saver.restore(sess, ckpt.model_checkpoint_path) # restore all variables
    else:
        tf.global_variables_initializer().run()

    start = global_step.eval() # number of steps already completed
    print("Start from: %d" % start)

    for i in range(start, end):
      batch = mnist.train.next_batch(100)
      if i%10 == 0:
        train_accuracy = accuracy.eval(session=sess,feed_dict={
            x:batch[0], y_: batch[1], keep_prob: 1.0})
        print("step %d, training accuracy %g"%(i, train_accuracy))

      train_step.run(session=sess,feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

      sess.run(increment_step)  # global_step is now i + 1, the number of completed steps
      saver.save(sess, ckpt_dir + "/model.ckpt", global_step=global_step)

    print("test accuracy %g"%accuracy.eval(session=sess,feed_dict={
        x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
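
The 7 * 7 * 64 in the fully connected layer follows from the layer shapes. The short sketch below works through that arithmetic, using the rule that with padding='SAME' the output size along each spatial dimension is the input size divided by the stride, rounded up. The helper name is made up for illustration, and the parameter count covers only the four CNN layers defined above.

```python
import math

def same_padding_output(size, stride):
    """Output length along one spatial dimension for padding='SAME'."""
    return int(math.ceil(float(size) / stride))

size = 28
size = same_padding_output(size, 1)  # first convolution, stride 1
size = same_padding_output(size, 2)  # first 2x2 max pool, stride 2
size = same_padding_output(size, 1)  # second convolution, stride 1
size = same_padding_output(size, 2)  # second 2x2 max pool, stride 2
flat_features = size * size * 64     # 64 feature maps leave the second layer

# Weights plus biases of the two convolutional and two dense layers
param_count = ((5 * 5 * 1 * 32 + 32)
               + (5 * 5 * 32 * 64 + 64)
               + (flat_features * 1024 + 1024)
               + (1024 * 10 + 10))
print(size, flat_features, param_count)
```

Almost all of the parameters sit in the first fully connected layer, which is why dropout is applied to its output.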

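After 16000 training steps, the model's accuracy is as shown in Figure 2:

Model testing

Step 3 trained the model. This step tests it: the script takes a handwritten digit image as input and prints the digit the model recognizes. Test images can be downloaded from the Baidu Netdisk address given in step 2. The test code follows. (Note: the script takes one argument, the path to the test image.)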

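# coding=utf-8
import os
import sys

import numpy as np
import tensorflow as tf
from PIL import Image  # provided by the Pillow package

# Rebuild the same graph as the training script so the checkpoint's variable names line up
x = tf.placeholder("float", shape=[None, 784])
y_ = tf.placeholder("float", shape=[None, 10])

# Not used by the CNN, but they take the auto-generated names "Variable" and
# "Variable_1"; removing them here would shift every later variable name
# and the restore below would fail.
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))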

# Weight and bias initialization (the initial values are overwritten by the restore)
def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)

def bias_variable(shape):
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)

# Convolution (stride 1, SAME padding) and 2x2 max pooling
def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1], padding='SAME')

# First convolutional layer
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])

x_image = tf.reshape(x, [-1,28,28,1])

h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

# Second convolutional layer
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])

h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

# Fully connected layer
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])

h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# Dropout: keep_prob is fed as 1.0 at test time, so nothing is dropped
keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# Output layer
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])

# Model output: one probability per digit class
y_conv=tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

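ckpt_dir = "./ckpt_dir"

saver = tf.train.Saver()
with tf.Session() as sess:
    ckpt = tf.train.get_checkpoint_state(ckpt_dir)
    if ckpt and ckpt.model_checkpoint_path:
        print(ckpt.model_checkpoint_path)
        saver.restore(sess, ckpt.model_checkpoint_path) # restore all variables
    else:
        # IOError exists in both Python 2 and 3; FileNotFoundError is Python 3 only
        raise IOError("No trained model found in " + ckpt_dir)

    # image_path="D:\\1.png"
    # image_path="/home/mnist/mnist03/test_data/"+sys.argv[1]+".png"
    image_path=sys.argv[1]
    print(image_path)
    img = Image.open(image_path).convert('L')  # 8-bit grayscale; a 28x28 image is expected
    pixels = np.asarray(img, dtype=np.float32).reshape(784)
    # The training images are floats in [0, 1] with a bright digit on a dark background.
    # Scale to [0, 1] first, then invert; this assumes the input is a dark digit on a
    # light background. Computing 1 - pixels directly on uint8 data wraps around
    # instead of inverting.
    real_x = np.array([1.0 - pixels / 255.0])
    # y has one row per image; with a single image, y[0] holds the 10 class probabilities
    y = sess.run(y_conv, feed_dict={x: real_x,keep_prob: 1.0})

    print('Predict digit %d' % np.argmax(y[0]))  # index of the largest probability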
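
The last step of the test script, turning the 10 outputs into a digit, can be sketched without TensorFlow. The example below is only an illustration of what softmax followed by argmax computes; the logits are invented numbers, not output from the trained model, and the function names are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_digit(probabilities):
    """Index of the largest probability, which is the predicted digit."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])

# Made-up scores for one image; position 3 holds the largest value
logits = [0.1, -1.2, 0.3, 4.0, 0.0, 1.5, -0.7, 0.2, 0.9, -2.0]
probs = softmax(logits)
digit = predict_digit(probs)
print(digit)
```

Because softmax preserves the ordering of its inputs, the predicted digit is the same whether argmax is taken over the raw scores or over the probabilities.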

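Results

Pass any handwritten digit image from 0 to 9 (28*28 pixels) to the test script to see the result. This article shows the recognition results for only 4 handwritten digits; you can test more (if you need test images, leave your contact details in the backend).

Figure 3: Test digits

Figure 4: Test results

As Figures 3 and 4 show, the model predicted all 4 handwritten digits correctly. You can also write a digit by hand, convert the image to 28*28 pixels with OpenCV, and run recognition on it. If you would rather not write the demo yourself, wait for the author's next article.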
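
As a rough idea of what such preprocessing involves, the sketch below shrinks a grayscale image to 28*28 and converts it into the 784 values the model expects. It is written in plain Python on nested lists as a stand-in for the resize and conversion that OpenCV or Pillow would perform; it is not the author's demo. The function names are hypothetical, nearest-neighbor sampling is chosen only for brevity, and the input is assumed to be dark ink on a light background.

```python
def resize_nearest(image, new_h, new_w):
    """Nearest-neighbor resize of a 2D list of pixel values."""
    old_h, old_w = len(image), len(image[0])
    return [[image[r * old_h // new_h][c * old_w // new_w]
             for c in range(new_w)]
            for r in range(new_h)]

def to_model_input(image):
    """Convert a grayscale image (values 0 to 255) into 784 floats in [0, 1].

    The values are inverted so that ink becomes bright, like the MNIST
    training images (assumption: the input is dark ink on a light background).
    """
    small = resize_nearest(image, 28, 28)
    return [1.0 - p / 255.0 for row in small for p in row]

# Synthetic 56x56 test image: white paper with a black square in the middle
image = [[255] * 56 for _ in range(56)]
for r in range(20, 36):
    for c in range(20, 36):
        image[r][c] = 0

vector = to_model_input(image)
print(len(vector), min(vector), max(vector))
```

The resulting list can be wrapped in an outer list and fed to the placeholder x in the test script. Real handwriting usually also needs centering and some smoothing to resemble the MNIST digits.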

References:

1. http://yann.lecun.com/exdb/mnist/

2. http://wiki.jikexueyuan.com/project/tensorflow-zh/tutorials/mnist_tf.html

