Introduction to Convolutional Neural Networks: LeNet5 (Handwritten Digit Recognition) Explained

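The first figure shows the structure of the 8-layer LeNet5 convolutional neural network, together with a diagram of its most complex stage: the connections from S2 to C3.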

The second and third figures show LeNet5 rewritten in TensorFlow, with annotations.
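Since the S2-to-C3 stage is singled out as the most complex one, here is a minimal sketch of its connection scheme as described in the classic LeNet-5 design: each of the 16 C3 feature maps reads only a subset of the 6 S2 maps. The helper names (`c3_connections`, `count_params`) are illustrative and not part of the article's code. The sketch also counts the trainable parameters and compares them with a fully connected 5x5 convolution, which is what a `[5, 5, 6, 16]` kernel in the TensorFlow scripts in this article amounts to.

```python
# Sketch of the S2 -> C3 connection scheme of the classic LeNet-5.
# Helper names are hypothetical; they do not appear in the article's code.
KERNEL = 5 * 5       # weights per (S2 map, C3 map) connection
NUM_S2_MAPS = 6


def c3_connections():
    """Return, for each of the 16 C3 maps, the list of S2 maps it reads."""
    table = []
    # C3 maps 0-5: three consecutive S2 maps (wrapping around)
    table += [[(i + k) % NUM_S2_MAPS for k in range(3)] for i in range(6)]
    # C3 maps 6-11: four consecutive S2 maps
    table += [[(i + k) % NUM_S2_MAPS for k in range(4)] for i in range(6)]
    # C3 maps 12-14: four S2 maps taken as two separated pairs
    table += [[(i + k) % NUM_S2_MAPS for k in (0, 1, 3, 4)] for i in range(3)]
    # C3 map 15: all six S2 maps
    table.append(list(range(NUM_S2_MAPS)))
    return table


def count_params(table):
    """One 5x5 kernel per connection plus one bias per C3 map."""
    return sum(len(inputs) * KERNEL + 1 for inputs in table)


sparse = c3_connections()
full = [list(range(NUM_S2_MAPS))] * 16   # every C3 map reads every S2 map

print(count_params(sparse))  # 1516
print(count_params(full))    # 2416
```

The sparse table needs 1516 parameters, while the fully connected variant needs 2416, so the TensorFlow versions in this article trade the original sparsity for a simpler implementation.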

This is the original LeNet5 network:

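# Requires TensorFlow 1.x: tf.placeholder and tensorflow.examples.tutorials
# are not available in TensorFlow 2.x
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import time

# Placeholders for the flattened input images and their one-hot class labels
x = tf.placeholder('float', [None, 784])
y_ = tf.placeholder('float', [None, 10])

# Reshape each 784-element vector into a 28x28 single-channel image
x_image = tf.reshape(x, [-1, 28, 28, 1])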

# First convolutional layer: 5x5 kernels, 1 input channel, 6 kernels.
# Kernel weights and biases are initialized from a truncated normal distribution.
filter1 = tf.Variable(tf.truncated_normal([5, 5, 1, 6]))
bias1 = tf.Variable(tf.truncated_normal([6]))
conv1 = tf.nn.conv2d(x_image, filter1, strides=[1, 1, 1, 1], padding='SAME')
h_conv1 = tf.nn.sigmoid(conv1 + bias1)
# 2x2 max pooling: 28x28 -> 14x14
maxPool2 = tf.nn.max_pool(h_conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

# Second convolutional layer: 5x5 kernels, 6 input channels, 16 kernels
filter2 = tf.Variable(tf.truncated_normal([5, 5, 6, 16]))
bias2 = tf.Variable(tf.truncated_normal([16]))
conv2 = tf.nn.conv2d(maxPool2, filter2, strides=[1, 1, 1, 1], padding='SAME')
h_conv2 = tf.nn.sigmoid(conv2 + bias2)
# 2x2 max pooling: 14x14 -> 7x7
maxPool3 = tf.nn.max_pool(h_conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

# Third convolutional layer: 5x5 kernels, 16 input channels, 120 kernels
filter3 = tf.Variable(tf.truncated_normal([5, 5, 16, 120]))
bias3 = tf.Variable(tf.truncated_normal([120]))
conv3 = tf.nn.conv2d(maxPool3, filter3, strides=[1, 1, 1, 1], padding='SAME')
h_conv3 = tf.nn.sigmoid(conv3 + bias3)

# Fully connected layer
# Weights
W_fc1 = tf.Variable(tf.truncated_normal([7 * 7 * 120, 80]))
# Biases
b_fc1 = tf.Variable(tf.truncated_normal([80]))
# Flatten the output of the last convolutional layer (7x7x120 per image)
h_pool2_flat = tf.reshape(h_conv3, [-1, 7 * 7 * 120])
# Fully connected computation followed by a sigmoid activation
h_fc1 = tf.nn.sigmoid(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# Output layer: softmax over the 10 digit classes
W_fc2 = tf.Variable(tf.truncated_normal([80, 10]))
b_fc2 = tf.Variable(tf.truncated_normal([10]))
y_conv = tf.nn.softmax(tf.matmul(h_fc1, W_fc2) + b_fc2)

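# Loss function: cross-entropy summed over the batch
cross_entropy = -tf.reduce_sum(y_ * tf.log(y_conv))
# Adjust the parameters with gradient descent (learning rate 0.001)
train_step = tf.train.GradientDescentOptimizer(0.001).minimize(cross_entropy)
sess = tf.InteractiveSession()

# Accuracy: fraction of samples whose predicted class matches the label
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

# Initialize all variables
# (tf.global_variables_initializer replaces the deprecated tf.initialize_all_variables)
sess.run(tf.global_variables_initializer())

# Load the MNIST data
mnist_data_set = input_data.read_data_sets('MNIST_data', one_hot=True)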

# Train
start_time = time.time()
for i in range(20000):
    # Fetch the next training batch
    batch_xs, batch_ys = mnist_data_set.train.next_batch(200)
    # Every 2 batches, evaluate on the current training batch and print the accuracy
    if i % 2 == 0:
        train_accuracy = accuracy.eval(feed_dict={x: batch_xs, y_: batch_ys})
        print("step %d, training accuracy %g" % (i, train_accuracy))
        # Time elapsed since the previous report
        end_time = time.time()
        print('time: ', (end_time - start_time))
        start_time = end_time
    # Run one training step on the batch
    train_step.run(feed_dict={x: batch_xs, y_: batch_ys})

# Close the session

sess.close()
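The flatten size `7 * 7 * 120` used in the script above follows from `padding='SAME'`: convolutions with stride 1 keep the spatial size, and each 2x2 pooling halves it. The sketch below traces the sizes with the standard output-size formulas for `'SAME'` and `'VALID'` padding. The helpers `conv_out` and `trace` are illustrative names, not TensorFlow functions. For comparison it also traces the classic LeNet-5 layout, which takes a 32x32 input and uses unpadded convolutions.

```python
import math


def conv_out(size, kernel, stride=1, padding='SAME'):
    """Spatial output size of one convolution or pooling step."""
    if padding == 'SAME':
        # Zero padding: the output depends only on the stride
        return math.ceil(size / stride)
    # 'VALID': no padding, the window must fit entirely inside the input
    return (size - kernel) // stride + 1


def trace(size, layers, padding):
    """Return the spatial size after each (kernel, stride) step."""
    sizes = [size]
    for kernel, stride in layers:
        sizes.append(conv_out(sizes[-1], kernel, stride, padding))
    return sizes


# (kernel, stride) for: conv 5x5, pool 2x2, conv 5x5, pool 2x2, conv 5x5
LAYERS = [(5, 1), (2, 2), (5, 1), (2, 2), (5, 1)]

same_sizes = trace(28, LAYERS, 'SAME')
print(same_sizes)                  # [28, 28, 14, 14, 7, 7]
print(same_sizes[-1] ** 2 * 120)   # 5880, i.e. 7 * 7 * 120

valid_sizes = trace(32, LAYERS, 'VALID')
print(valid_sizes)                 # [32, 28, 14, 10, 5, 1]
```

With unpadded convolutions the third convolution reduces each map to 1x1, so the 120 maps already form a 120-element vector. With `'SAME'` padding the maps stay 7x7, which is why the script has to flatten 5880 values before the fully connected layer.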

Below is the improved LeNet5 network:

import tensorflow as tf

from tensorflow.examples.tutorials.mnist import input_data

import time

import matplotlib.pyplot as plt

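# Initialize a weight tensor with small truncated-normal noise (stddev 0.1)
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

# Initialize a bias tensor to a small positive constant
def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

# Convolve the input features x with kernel W. strides is the step of the kernel;
# padding='SAME' pads the borders so the output keeps the input's spatial size
def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

# Max-pool x. ksize is the pooling window (2x2) and strides the step between
# windows, so each spatial dimension is halved
def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')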

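sess = tf.InteractiveSession()

# Placeholders for the flattened input images and their one-hot class labels
x = tf.placeholder('float32', [None, 784])
y_ = tf.placeholder('float32', [None, 10])

# Reshape each 784-element vector into a 28x28 single-channel image
x_image = tf.reshape(x, [-1, 28, 28, 1])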

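# First convolutional layer (32 kernels of 5x5) with ReLU, then 2x2 max pooling
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

# Second convolutional layer (64 kernels of 5x5) with ReLU, then 2x2 max pooling
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

# First fully connected layer: weights
W_fc1 = weight_variable([7 * 7 * 64, 1024])
# Biases
b_fc1 = bias_variable([1024])
# Flatten the output of the second pooling layer (7x7x64 per image)
h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
# Fully connected computation followed by a ReLU activation
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# Second fully connected layer: 1024 -> 128
W_fc2 = weight_variable([1024, 128])
b_fc2 = bias_variable([128])
h_fc2 = tf.nn.relu(tf.matmul(h_fc1, W_fc2) + b_fc2)

# Output layer: softmax over the 10 digit classes
W_fc3 = weight_variable([128, 10])
b_fc3 = bias_variable([10])
y_conv = tf.nn.softmax(tf.matmul(h_fc2, W_fc3) + b_fc3)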

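# Cost function: cross-entropy summed over the batch
cross_entropy = -tf.reduce_sum(y_ * tf.log(y_conv))
# Adjust the parameters with gradient descent (learning rate 1e-5).
# The original comment mentioned Adam; tf.train.AdamOptimizer can be swapped in here.
train_step = tf.train.GradientDescentOptimizer(1e-5).minimize(cross_entropy)

# Accuracy: fraction of samples whose predicted class matches the label
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float32"))

# Initialize all variables
# (tf.global_variables_initializer replaces the deprecated tf.initialize_all_variables)
sess.run(tf.global_variables_initializer())

# Load the MNIST data
mnist_data_set = input_data.read_data_sets('MNIST_data', one_hot=True)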

# Training accuracy history, plotted after training
c = []

# Train
start_time = time.time()
for i in range(1000):
    # Fetch the next training batch
    batch_xs, batch_ys = mnist_data_set.train.next_batch(200)
    # Every 2 batches, evaluate on the current training batch and print the accuracy
    if i % 2 == 0:
        train_accuracy = accuracy.eval(feed_dict={x: batch_xs, y_: batch_ys})
        c.append(train_accuracy)
        print("step %d, training accuracy %g" % (i, train_accuracy))
        # Time elapsed since the previous report
        end_time = time.time()
        print('time: ', (end_time - start_time))
        start_time = end_time
    # Run one training step on the batch
    train_step.run(feed_dict={x: batch_xs, y_: batch_ys})

sess.close()

# Plot the recorded training accuracy
plt.plot(c)
plt.tight_layout()
plt.show()
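To get a feel for how much larger the improved network is, the following sketch counts the trainable parameters of both scripts from the layer shapes that appear in them. The helper names `conv_params` and `fc_params` are illustrative. The counts are derived from the shapes alone, not measured from a TensorFlow run.

```python
def conv_params(kernel, channels_in, channels_out):
    """Weights of a kernel x kernel convolution plus one bias per output map."""
    return kernel * kernel * channels_in * channels_out + channels_out


def fc_params(units_in, units_out):
    """Weights of a fully connected layer plus one bias per output unit."""
    return units_in * units_out + units_out


# Layer shapes copied from the first script (sigmoid network)
original = [
    conv_params(5, 1, 6),
    conv_params(5, 6, 16),
    conv_params(5, 16, 120),
    fc_params(7 * 7 * 120, 80),
    fc_params(80, 10),
]

# Layer shapes copied from the second script (ReLU network)
improved = [
    conv_params(5, 1, 32),
    conv_params(5, 32, 64),
    fc_params(7 * 7 * 64, 1024),
    fc_params(1024, 128),
    fc_params(128, 10),
]

print(sum(original))  # 521982
print(sum(improved))  # 3396874
```

In both networks most of the parameters sit in the first fully connected layer, because it is fed the flattened 7x7 feature maps.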
