VGGnet: From Building TFRecords to Training the Network

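As a beginner among beginners, I figure tinkering is always worthwhile. After reading an introductory book and some online tutorials, I noticed that most TensorFlow tutorials start from the MNIST dataset: import MNIST, build the network, define the loss and optimizer, set the hyperparameters, and call sess.run(). The workflow looks simple, but if you are handed a pile of raw images, it is not at all clear how to get TensorFlow to read them or how to feed them into a network for training. So I decided to start with a relatively simple CNN, VGGnet, and downloaded a dataset more or less at random: GTSRB (it was the smallest one, so it downloaded fast = =). The preprocessing of the downloaded images is covered in another post of mine on image data processing; this post is mainly about creating and reading TFRecords files.

I am not a CS major, and my research area has nothing to do with this (the computer vision direction my advisor and I agreed on when I enrolled has since been vetoed, which at one point made me want to switch advisors; long story, ten thousand words omitted here). Juggling my advisor's work and this project has been quite an experience = =. I fought with this program for nearly two weeks before finally subduing all the bugs and getting it to run, and it has now entered the so-called "hyperparameter tuning" stage. The results are still far from ideal and the program below may well contain errors, but for a beginner like me this round of tinkering has already taught me a lot.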

Now, on to the main topic.

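A TFRecords file is one of the ways TensorFlow reads data, and it is mainly used when the dataset is large. A TFRecords file contains tf.train.Example protocol buffers (each of which holds a Features field).

You fill your own data into an Example protocol buffer, serialize the protocol buffer to a string, and write it to a TFRecords file with tf.python_io.TFRecordWriter.

To read data back from a TFRecords file, use tf.TFRecordReader together with the tf.parse_single_example parser, which parses an Example protocol buffer into tensors.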

The description above is taken from: https://www.cnblogs.com/upright/p/6136265.html
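To make the "serialize to a string, then write records to a file" idea more concrete, here is a minimal sketch in plain Python. It is a simplified stand-in and not the real TFRecords format: a real TFRecords file also stores masked CRC32C checksums next to each length and payload, so files written this way cannot be read by TensorFlow. The function names write_records and read_records are made up for this illustration.

```python
import os
import struct
import tempfile


def write_records(path, payloads):
    """Write each bytes payload as an 8-byte little-endian length, then the data."""
    with open(path, "wb") as f:
        for payload in payloads:
            f.write(struct.pack("<Q", len(payload)))
            f.write(payload)


def read_records(path):
    """Yield the payloads back in the order they were written."""
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if not header:
                break  # clean end of file
            (length,) = struct.unpack("<Q", header)
            yield f.read(length)


if __name__ == "__main__":
    # Hypothetical payloads standing in for serialized Example buffers.
    path = os.path.join(tempfile.mkdtemp(), "toy.records")
    write_records(path, [b"first example", b"second example"])
    print(list(read_records(path)))
```

The point of the length prefix is that the reader never has to understand the payload: it just hands back opaque byte strings, and a separate parser (tf.parse_single_example in the TensorFlow case) turns each one into tensors.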

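I'll just paste the code below. Some parts are not original, and most of the explanation lives in the code comments (fine, I admit I'm lazy = =; I will update this post later).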

VGGnet.py:

 # -*- coding: utf-8 -*-
 import time

 import tensorflow as tf

 import convert_TFrecords

 # Training hyperparameters
 learning_rate = 0.005
 batch_size = 300
 epoch = 20000
 display_step = 10

 # Network parameters
 Dropout = 0.75  # keep probability; the drop probability is 1 - Dropout
 n_inputs = 128 * 128 * 3  # input dimension (img_size)
 n_classes = 43

 weights = {'w1': tf.Variable(tf.random_normal([3, 3, 3, 16])),
            'w2': tf.Variable(tf.random_normal([3, 3, 16, 16])),
            'w3': tf.Variable(tf.random_normal([3, 3, 16, 32])),
            'w4': tf.Variable(tf.random_normal([3, 3, 32, 32])),
            'w5': tf.Variable(tf.random_normal([3, 3, 32, 64])),
            'w6': tf.Variable(tf.random_normal([3, 3, 64, 64])),
            'w7': tf.Variable(tf.random_normal([3, 3, 64, 128])),
            'w8': tf.Variable(tf.random_normal([3, 3, 128, 128])),
            'w9': tf.Variable(tf.random_normal([3, 3, 128, 128])),
            'w10': tf.Variable(tf.random_normal([3, 3, 128, 128])),
            'wd1': tf.Variable(tf.random_normal([8*8*128, 4096])),
            'wd2': tf.Variable(tf.random_normal([1*1*4096, 4096])),
            'out': tf.Variable(tf.random_normal([4096, 43]))}  # 43 classes in total

 biases = {'b1': tf.Variable(tf.random_normal([16])),
           'b2': tf.Variable(tf.random_normal([16])),
           'b3': tf.Variable(tf.random_normal([32])),
           'b4': tf.Variable(tf.random_normal([32])),
           'b5': tf.Variable(tf.random_normal([64])),
           'b6': tf.Variable(tf.random_normal([64])),
           'b7': tf.Variable(tf.random_normal([128])),
           'b8': tf.Variable(tf.random_normal([128])),
           'b9': tf.Variable(tf.random_normal([128])),
           'b10': tf.Variable(tf.random_normal([128])),
           'bd1': tf.Variable(tf.random_normal([4096])),
           'bd2': tf.Variable(tf.random_normal([4096])),
           'out': tf.Variable(tf.random_normal([43]))}

 def conv(name, input, W, b, strides=1, padding='SAME'):
     # Convolution, then bias, then ReLU
     x = tf.nn.conv2d(input, W, strides=[1, strides, strides, 1], padding=padding)
     x = tf.nn.bias_add(x, b)
     return tf.nn.relu(x, name=name)

 # The input is a 4-D tensor whose first dimension is the batch size. The network is
 # written as if for a single sample; this does no harm, because the batch dimension
 # is left open and is only fixed when the graph is run.
 def VGGnet(input, weights, biases, keep_prob):
     x = tf.reshape(input, shape=[-1, 128, 128, 3])   # the -1 is determined by batch_size
     conv1 = conv('conv1', x, weights['w1'], biases['b1'])
     conv2 = conv('conv2', conv1, weights['w2'], biases['b2'])
     pool1 = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME', name='pool1')

     conv3 = conv('conv3', pool1, weights['w3'], biases['b3'])
     conv4 = conv('conv4', conv3, weights['w4'], biases['b4'])
     pool2 = tf.nn.max_pool(conv4, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME', name='pool2')

     conv5 = conv('conv5', pool2, weights['w5'], biases['b5'])
     conv6 = conv('conv6', conv5, weights['w6'], biases['b6'])
     conv7 = conv('conv7', conv6, weights['w7'], biases['b7'])
     pool3 = tf.nn.max_pool(conv7, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME', name='pool3')

     conv8 = conv('conv8', pool3, weights['w8'], biases['b8'])
     conv9 = conv('conv9', conv8, weights['w9'], biases['b9'])
     conv10 = conv('conv10', conv9, weights['w10'], biases['b10'])
     pool4 = tf.nn.max_pool(conv10, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME', name='pool4')

     # Flatten and run the fully connected layers
     fc1 = tf.reshape(pool4, [-1, weights['wd1'].get_shape().as_list()[0]])
     fc1 = tf.add(tf.matmul(fc1, weights['wd1']), biases['bd1'])
     re1 = tf.nn.relu(fc1, 're1')
     drop1 = tf.nn.dropout(re1, keep_prob)

     fc2 = tf.reshape(drop1, [-1, weights['wd2'].get_shape().as_list()[0]])
     fc2 = tf.add(tf.matmul(fc2, weights['wd2']), biases['bd2'])
     re2 = tf.nn.relu(fc2, 're2')
     drop2 = tf.nn.dropout(re2, keep_prob)

     fc3 = tf.reshape(drop2, [-1, weights['out'].get_shape().as_list()[0]])
     fc3 = tf.add(tf.matmul(fc3, weights['out']), biases['out'])
     # print(fc3)  # checkpoint
     # tf.nn.softmax_cross_entropy_with_logits already applies softmax internally, so do not
     # add another softmax layer here (training accuracy finally looked sane after I found
     # this mistake).
     # sm = tf.nn.softmax(fc3)
     return fc3

 # The placeholder shape must match the tensor that is fed in! With the MNIST dataset the
 # shape of x is [None, 28*28*1] because each image is fed in flattened into a row.
 x = tf.placeholder(tf.float32, [None, 128, 128, 3])
 y = tf.placeholder(tf.float32, [None, n_classes])
 dropout = tf.placeholder(tf.float32)

 pred = VGGnet(x, weights, biases, dropout)

 # Loss function and optimizer
 # Error: Only call `softmax_cross_entropy_with_logits` with named arguments (labels=..., logits=...,)
 # Fix: pass the arguments as keyword arguments.
 # tf.nn.softmax_cross_entropy_with_logits first applies softmax to the output of the last layer,
 # then computes the cross-entropy -sum_i(y_i * log(p_i)) for each sample (y_i is the true label,
 # p_i the predicted probability); tf.reduce_mean then averages these per-sample losses over the batch.
 # Reference: https://blog.csdn.net/mao_xiao_feng/article/details/53382790
 loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
 optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)

 # Evaluation
 # tf.argmax() returns the index of the largest element of each vector (axis=1);
 # tf.equal() returns, element by element, whether two values are equal (True or False).
 # https://blog.csdn.net/qq575379110/article/details/70538051/
 # https://blog.csdn.net/uestc_c2_403/article/details/72232924
 correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
 accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

 # init = tf.initialize_all_variables()
 batch_x, batch_y = convert_TFrecords.inputs(True, batch_size, epoch)

 with tf.Session() as sess:
     # sess.run(init)
     # Run the initializers first
     # Reference: https://blog.csdn.net/lujiandong1/article/details/53376802
     sess.run(tf.global_variables_initializer())
     sess.run(tf.local_variables_initializer())
     # sess.run(tf.initialize_all_variables())

     # Create a coordinator
     coord = tf.train.Coordinator()
     # Use start_queue_runners to start filling the input queues
     threads = tf.train.start_queue_runners(sess, coord)

     try:
         step = 1
         while not coord.should_stop():
             # Fetch batch_size samples and labels for each batch.
             # The line below used to sit right here, inside the loop (moving it finally
             # solved a problem that had me stuck for days):
             # batch_x, batch_y = convert_TFrecords.inputs(True, batch_size, epoch)
             # With it here the program hung: it neither ran nor reported an error, most
             # likely because the queue runners had already been started, so the queues
             # created afterwards were never filled.
             # checkpoint: print('kaka')
             # print(batch_x)
             # print(batch_y)
             # print('okok')  # checkpoint
             # Without the following line you get the error:
             # The value of a feed cannot be a tf.Tensor object. Acceptable feed
             # values include Python scalars, strings, lists, numpy ndarrays, or TensorHandles.
             # I first thought tensor.eval() was needed to turn the tensor into an np.array, but
             # back then batch_x, batch_y = convert_TFrecords.inputs(True, batch_size, epoch)
             # was inside the session loop, so execution hung at tensor.eval() just the same.
             b_x, b_y = sess.run([batch_x, batch_y])
             # print('haha')  # checkpoint
             # Printing a tensor shows three entries by default.
             # Reference: https://blog.csdn.net/qq_34484472/article/details/75049179
             # print(b_x, b_y)  # checkpoint
             # The arrays fed through the dict were originally not named b_x, b_y but had the same
             # names as the keys (x, y). The variable names clobbered the placeholders, which raised:
             # unhashable type: 'numpy.ndarray'
             # This error can have other causes too, see: https://blog.csdn.net/wongleetion/article/details/80885648
             start = time.time()
             sess.run(optimizer, feed_dict={x: b_x, y: b_y, dropout: Dropout})
             if step % display_step == 0:
                 # I had mistyped the feed_dict key dropout as keep_prob, which raised:
                 # Cannot interpret feed_dict key as Tensor: Can not convert a float into a Tensor
                 # Reference: https://blog.csdn.net/ice_pill/article/details/78567841
                 Loss, acc = sess.run([loss, accuracy], feed_dict={x: b_x, y: b_y, dropout: 1.0})
                 print('iter ' + str(step) + ', minibatch loss = ' +
                       '{: .6f}'.format(Loss) + ', training accuracy = ' + '{: .5f}'.format(acc))
                 # sess.run(tf.Print(b_y, [b_y], summarize=43))
                 print(b_y)
             print('iter %d, duration: %.2fs' % (step, time.time() - start))
             step += 1
     except tf.errors.OutOfRangeError:  # raised when the end of the file queue is reached
         print("done! now lets kill all the threads...")
     finally:
         # The coordinator asks all threads to stop
         coord.request_stop()
         print('all threads are asked to stop!')
     coord.join(threads)  # wait for the started threads to finish
     print('all threads are stopped!')
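The comments in VGGnet.py mention that adding an extra softmax layer before tf.nn.softmax_cross_entropy_with_logits hurt training. The sketch below reproduces that effect for a single sample in plain Python (no TensorFlow), using made-up logits. The helper names softmax, cross_entropy and double_softmax_floor are my own, and the code only illustrates the arithmetic; it is not how TensorFlow implements the op.

```python
import math


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, one_hot):
    """-sum(y_i * log(softmax(logits)_i)) for a single sample."""
    probs = softmax(logits)
    return -sum(y * math.log(p) for y, p in zip(one_hot, probs) if y > 0)


def double_softmax_floor(n_classes):
    """Smallest loss reachable when the input has already been through softmax.

    The input is then confined to [0, 1], so the best case is 1 for the true
    class and 0 elsewhere, which gives log(1 + (n_classes - 1) / e).
    """
    return math.log(1.0 + (n_classes - 1) / math.e)


if __name__ == "__main__":
    logits = [4.0, 1.0, 0.5]   # hypothetical network output for 3 classes
    label = [1.0, 0.0, 0.0]    # the true class is class 0

    print("loss on raw logits:      %.4f" % cross_entropy(logits, label))
    # The mistake: softmax applied by hand, then again inside the loss.
    print("loss on softmaxed input: %.4f" % cross_entropy(softmax(logits), label))
    print("loss floor, 43 classes:  %.4f" % double_softmax_floor(43))
```

With the extra softmax, even a perfectly confident and correct prediction cannot push the loss toward zero, so the gradients stay small and uninformative. That matches the observation in the comments that accuracy only became reasonable once the extra layer was removed.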

convert_TFrecords.py (creating and reading the TFRecords file):

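 # -*- coding: utf-8 -*-
 import os

 import tensorflow as tf
 from PIL import Image

 cur_dir = os.getcwd()
 # classes = ['test_file_dir', 'train_file_dir']
 train_set = os.path.join(cur_dir, 'train_file_dir')
 # Sort the folder names so each class gets the same index on every run
 # (os.listdir returns entries in arbitrary order).
 classes = sorted(os.listdir(train_set))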

 # Write the binary data
 def create_record():
     print('processing...')
     writer = tf.python_io.TFRecordWriter('train.tfrecords')
     num_labels = len(classes)
     print('num of classes: %d' % num_labels)
     for index, name in enumerate(classes):
         class_path = os.path.join(train_set, name)
         # One-hot label shared by every image in this class folder
         label = [0] * num_labels
         label[index] = 1
         for img_name in os.listdir(class_path):
             img_path = os.path.join(class_path, img_name)
             img = Image.open(img_path)
             # img = img.resize((64, 64))
             img_raw = img.tobytes()  # convert the image to raw bytes
             # print(img_raw)
             # print(index,img_raw)
             # A TFRecords file is a binary file that stores image data and labels together. It makes
             # better use of memory and can be copied, moved, read and stored quickly in TensorFlow.
             # The file contains tf.train.Example protocol buffers (which hold Features). You write
             # code that fetches your data, fills it into an Example protocol buffer, serializes the
             # buffer to a string, and writes it to the TFRecords file with tf.python_io.TFRecordWriter.
             example = tf.train.Example(
                 # Every value in the feature dict is a list of one of three types:
                 # FloatList, BytesList or Int64List
                 # Reference: https://blog.csdn.net/u012759136/article/details/52232266
                 # Reference: https://blog.csdn.net/shenxiaolu1984/article/details/52857437
                 features=tf.train.Features(feature={
                     # The image's label in the TFRecords file (identical within one folder). Note that
                     # a list of length num_labels is stored, not a single value!
                     'label': tf.train.Feature(int64_list=tf.train.Int64List(value=label)),  # label is already a list, no brackets needed
                     # The image's pixel data in the TFRecords file
                     'img_raw': tf.train.Feature(bytes_list=tf.train.BytesList(value=[img_raw]))
                 }))
             writer.write(example.SerializeToString())
     writer.close()
     print('TFrecords file created successfully!!')

 # Read the binary data
 def read_and_decode(filename, num_epochs):
     # Build a queue of file names (kept in order if shuffle=False, shuffled if shuffle=True)
     filename_queue = tf.train.string_input_producer([filename], shuffle=True, num_epochs=num_epochs)
     print('qunide')
     reader = tf.TFRecordReader()
     _, serialized_example = reader.read(filename_queue)   # returns the record key and the serialized record
     features = tf.parse_single_example(serialized_example,
                                        features={
                                            # I don't understand this function well. The shape for 'label' used to be
                                            # empty ([]), which raised: Key: label, Index: 0.  Number
                                            # of int64 values != expected.  Values size: 43 but output shape: []
                                            # The dtypes must match what was written to the TFRecords file!
                                            'label': tf.FixedLenFeature([43], tf.int64),
                                            'img_raw': tf.FixedLenFeature([], tf.string),
                                        })
     img = features['img_raw']
     # decode_raw() can only decode data stored as a BytesList
     img = tf.decode_raw(img, tf.uint8)
     img = tf.reshape(img, [128, 128, 3])
     img = tf.cast(img, tf.float32) * (1. / 255) - 0.5     # normalize to the range [-0.5, 0.5]
     label = features['label']
     # label = tf.reshape(label, [43])   # not needed: the shape was already [43] when stored
     label = tf.cast(label, tf.float32)    # because the network output pred is float32 (?)
     print('label', label)
     print('image', img)
     return img, label

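 def inputs(train, batch_size, num_epochs):
     print('qunide2')
     if not num_epochs:
         num_epochs = None
     filename = os.path.join(cur_dir, 'train.tfrecords' if train else 'test.tfrecords')  # good enough for now
     with tf.name_scope('input'):
         image, label = read_and_decode(filename, num_epochs)
         # print(image)  # checkpoint
         # tf.train.shuffle_batch buffers the examples read via the tf.train.string_input_producer file
         # queue in a shuffling queue and draws batches from it at random. capacity is the maximum size
         # of that queue; min_after_dequeue is the minimum number of examples left in it after a
         # dequeue, which sets how well the examples are mixed.
         # My personal guess: together with batch_size, these values affect training accuracy (after
         # refilling and reshuffling, many samples the network has not seen yet get fed in, and
         # accuracy improves once all the training data has been seen).
         images, sparse_labels = tf.train.shuffle_batch([image, label], batch_size=batch_size,
                                                         num_threads=2, capacity=3000,  # thread count is usually set to the number of CPU cores,
                                                         # but more threads are not always faster; too many can even reduce throughput
                                                         # Reference: https://blog.csdn.net/lujiandong1/article/details/53376802
                                                         # https://blog.csdn.net/heiheiya/article/details/80967301
                                                         min_after_dequeue=2000)
         # print(images)  # checkpoint
         return images, sparse_labels
     # Note: the dtype and shape of the returned values must match those of the tf.placeholder()s!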

 if __name__ == '__main__':
     create_record()
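The comments in inputs() guess at how min_after_dequeue affects what the network sees. Here is a small plain-Python model of a shuffling queue that makes the effect visible. It is only a rough sketch under a simplifying assumption: the consumer takes a random item as soon as the buffer holds more than min_after_dequeue items, which is the least-mixed case (a real queue may hold up to capacity items). The function name shuffle_stream and the toy data (43 classes, 100 samples each, stored class by class as create_record() writes them) are made up for the illustration.

```python
import random


def shuffle_stream(stream, min_after_dequeue, seed=0):
    """Least-mixed model of a shuffling queue.

    A random item is dequeued as soon as the buffer holds more than
    `min_after_dequeue` items; leftovers are drained once the input ends.
    """
    rng = random.Random(seed)
    buffer = []
    for item in stream:
        buffer.append(item)
        if len(buffer) > min_after_dequeue:
            yield buffer.pop(rng.randrange(len(buffer)))
    # Input exhausted: the minimum no longer applies, drain the rest at random.
    while buffer:
        yield buffer.pop(rng.randrange(len(buffer)))


if __name__ == "__main__":
    # Toy label stream written class by class: 0,0,...,0,1,1,...,42
    labels = [c for c in range(43) for _ in range(100)]
    batch_size = 300
    for mad in (0, 2000, len(labels)):
        first_batch = list(shuffle_stream(labels, mad))[:batch_size]
        print("min_after_dequeue=%4d -> %2d distinct classes in the first batch"
              % (mad, len(set(first_batch))))
```

If the file is written one class after another and the buffer is much smaller than the dataset, early batches only contain the first few classes, and the later classes show up much later. Whether this explains the low accuracy here is untested; shuffling the image list before writing the file, or using a larger min_after_dequeue, would be ways to check.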

Although the program now runs, the training accuracy is still very low, and many things remain to be tuned.
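One thing that might be worth checking while tuning is the weight initialization. tf.random_normal defaults to a standard deviation of 1.0, which is large for layers with thousands of inputs. The sketch below is plain Python and checks two things: that four SAME-padded 2x2 poolings with stride 2 really shrink a 128x128 input to the 8x8 assumed by wd1, and what standard deviation He initialization (sqrt(2 / fan_in), a common choice for ReLU layers) would suggest for a few layers. Using He initialization is my assumption, not something the code above does, and I have not verified that it fixes the accuracy.

```python
import math


def same_pool_output(size, stride=2):
    """Spatial size after a SAME-padded pooling layer: ceil(size / stride)."""
    return math.ceil(size / stride)


def he_stddev(fan_in):
    """He-initialization standard deviation for a ReLU layer: sqrt(2 / fan_in)."""
    return math.sqrt(2.0 / fan_in)


if __name__ == "__main__":
    size = 128
    for _ in range(4):             # pool1 .. pool4
        size = same_pool_output(size)
    flat = size * size * 128       # 128 channels after conv10
    print("flattened size fed to wd1:", flat)

    # fan_in = kernel_h * kernel_w * in_channels for conv layers,
    # and the number of input units for fully connected layers.
    fan_ins = {"w1": 3 * 3 * 3, "w10": 3 * 3 * 128,
               "wd1": flat, "wd2": 4096, "out": 4096}
    for name, fan_in in fan_ins.items():
        print("%-4s fan_in=%5d  suggested stddev=%.4f"
              % (name, fan_in, he_stddev(fan_in)))
```

In TensorFlow 1.x the suggested value could be passed as the stddev argument of tf.random_normal, for example tf.random_normal([8*8*128, 4096], stddev=0.015625) for wd1.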

Besides the posts mentioned in the code, I also consulted the following:

https://blog.csdn.net/dcrmg/article/details/79780331

https://blog.csdn.net/qq_30666517/article/details/79715045

https://www.cnblogs.com/upright/p/6136265.html

https://blog.csdn.net/tengxing007/article/details/56847828

https://blog.csdn.net/ali197294332/article/details/78720309

https://blog.csdn.net/ying86615791/article/details/73864381
