Keras-在预训练好网络模型上进行fine-tune

在深度学习的学习过程中，可能会用到一些已经训练好的模型，比如Alex Net，google Net，VGG,Resnet等，那我们怎样对这些训练好的模型进行fine-tune来提高准确率呢？

参考文章：https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html

使用已经训练好的VGG16模型来帮助我们进行这个分类任务，因为要分类的是猫，狗这类物体，而VGG net是在ImageNet上训练的，而imageNet实际上已经包含了这2种物体（猫，狗）了。

方法

首先载入VGG-16的权重

接下来在初始化好的VGG网络上添加我们预训练好的模型

最后将最后一个卷积块的层数冻结，然后以很低的学习率开始训练（我们只选择最后一个卷积块进行训练，因为训练样本很少，而VGG模型层数很多，全部训练肯定不能训练好，会过拟合）。其次fine-tune是由于在一个已经训练好的模型上进行的，故权值更新应该是一个小范围的，以免破坏预训练好的特征。

首先构造VGG16模型

# build the VGG16 network

model = Sequential()

model.add(ZeroPadding2D((1, 1), input_shape=(3, img_width, img_height)))

model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_1'))

model.add(ZeroPadding2D((1, 1)))

model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_2'))

model.add(MaxPooling2D((2, 2), strides=(2, 2)))

model.add(ZeroPadding2D((1, 1)))

model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_1'))

model.add(ZeroPadding2D((1, 1)))

model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_2'))

model.add(MaxPooling2D((2, 2), strides=(2, 2)))

model.add(ZeroPadding2D((1, 1)))

model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_1'))

model.add(ZeroPadding2D((1, 1)))

model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_2'))

model.add(ZeroPadding2D((1, 1)))

model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_3'))

model.add(MaxPooling2D((2, 2), strides=(2, 2)))

model.add(ZeroPadding2D((1, 1)))

model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_1'))

model.add(ZeroPadding2D((1, 1)))

model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_2'))

model.add(ZeroPadding2D((1, 1)))

model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_3'))

model.add(MaxPooling2D((2, 2), strides=(2, 2)))

model.add(ZeroPadding2D((1, 1)))

model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_1'))

model.add(ZeroPadding2D((1, 1)))

model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_2'))

model.add(ZeroPadding2D((1, 1)))

model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_3'))

model.add(MaxPooling2D((2, 2), strides=(2, 2)))

加载VGG16训练好的权重（我们只要全连接以前的权重）：

# load the weights of the VGG16 networks

# (trained on ImageNet, won the ILSVRC competition in 2014)

# note: when there is a complete match between your model definition

# and your weight savefile, you can simply call model.load_weights(filename)

assert os.path.exists(weights_path), 'Model weights not found (see "weights_path" variable in script).'

f = h5py.File(weights_path)

for k in range(f.attrs['nb_layers']):

    if k >= len(model.layers):

        # we don't look at the last (fully-connected) layers in the savefile

        break

    g = f['layer_{}'.format(k)]

    weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])]

    model.layers[k].set_weights(weights)

f.close()

print('Model loaded.')

然后再VGG16结构基础上添加一个简单的分类器及预训练好的模型：

# build a classifier model to put on top of the convolutional model

top_model = Sequential()

top_model.add(Flatten(input_shape=model.output_shape[1:]))

top_model.add(Dense(256, activation='relu'))

top_model.add(Dropout(0.5))

top_model.add(Dense(1, activation='sigmoid'))

# note that it is necessary to start with a fully-trained

# classifier, including the top classifier,

# in order to successfully do fine-tuning

top_model.load_weights(top_model_weights_path)

# add the model on top of the convolutional base

model.add(top_model)

把随后一个卷积块前的权重设置为不训练：

# set the first 25 layers (up to the last conv block)

# to non-trainable (weights will not be updated)

for layer in model.layers[:25]:

    layer.trainable = False

# compile the model with a SGD/momentum optimizer

# and a very slow learning rate.

model.compile(loss='binary_crossentropy',

              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),

              metrics=['accuracy'])

这样一个很简单的fine-tune在50个epoch后就可以达到一个大概0.94的accuracy

Keras-在预训练好网络模型上进行fine-tune的更多相关文章

学习AI之NLP后对预训练语言模型——心得体会总结
一.学习NLP背景介绍: 从2019年4月份开始跟着华为云ModelArts实战营同学们一起进行了6期关于图像深度学习的学习,初步了解了关于图像标注.图像分类.物体检测,图像都目标物体检测等 ...
知识增强的预训练语言模型系列之ERNIE：如何为预训练语言模型注入知识
NLP论文解读 |杨健论文标题: ERNIE:Enhanced Language Representation with Informative Entities 收录会议:ACL 论文链接: ht ...
在Keras模型中one-hot编码,Embedding层,使用预训练的词向量/处理图片
最近看了吴恩达老师的深度学习课程,又看了python深度学习这本书,对深度学习有了大概的了解,但是在实战的时候, 还是会有一些细枝末节没有完全弄懂,这篇文章就用来总结一下用keras实现深度学习算法的 ...
VGG16等keras预训练权重文件的下载及本地存放
VGG16等keras预训练权重文件的下载: https://github.com/fchollet/deep-learning-models/releases/ .h5文件本地存放目录: Linux ...
AI：拿来主义——预训练网络（二）
上一篇文章我们聊的是使用预训练网络中的一种方法,特征提取,今天我们讨论另外一种方法,微调模型,这也是迁移学习的一种方法. 微调模型为什么需要微调模型?我们猜测和之前的实验,我们有这样的共识,数据量越 ...
从Word Embedding到Bert模型—自然语言处理中的预训练技术发展史（转载）
转载 https://zhuanlan.zhihu.com/p/49271699 首发于深度学习前沿笔记写文章从Word Embedding到Bert模型—自然语言处理中的预训练技术发展史张 ...
pytorch预训练
Pytorch预训练模型以及修改 pytorch中自带几种常用的深度学习网络预训练模型,torchvision.models包中包含alexnet.densenet.inception.resnet. ...
第二十四节，TensorFlow下slim库函数的使用以及使用VGG网络进行预训练、迁移学习(附代码)
在介绍这一节之前,需要你对slim模型库有一些基本了解,具体可以参考第二十二节,TensorFlow中的图片分类模型库slim的使用.数据集处理,这一节我们会详细介绍slim模型库下面的一些函数的使用 ...
BERT总结：最先进的NLP预训练技术
BERT(Bidirectional Encoder Representations from Transformers)是谷歌AI研究人员最近发表的一篇论文:BERT: Pre-training o ...

随机推荐

PHP截断函数mb_substr()
提示:mb_substr在于php中是默认不被支持的我们需要在在windows目录下找到php.ini打开编辑,搜索mbstring.dll,找到;extension=php_mbstring.dll ...
Struts2 ajax json使用介绍
一.jar包首先引入Struts和json所需的jar包. watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQvaXRteWhvbWUxOTkw/font/5a6 ...
js时间转化
const defaultTicks = 621355968000000000; export function convertDateToTicks(date = new Date()) { ret ...
Buff系统的实现
BUFF是很多游戏都在采用的一种临时增益机制.本文讲述如何在基于关系型数据库的网页游戏中实现这一系统:如何扩展该系统:以及如何提高该系统的性能. 引言 BUFF是很多游戏都在采用的一种临时增益机制:与 ...
vs2013\2015-UML
1.UML简介Unified Modeling Language (UML)又称统一建模语言或标准建模语言. 简单说就是以图形方式表现模型,根据不同模型进行分类,在UML 2.0中有13种图,以下是他 ...
oracle如何将am,pm时间字符串改为时间格式
问题: 解决办法: 1.param["OPT_DATE"] = DateTime.Parse(dt.Rows[0]["CREATED_ON"].ToString ...
Android 6.0启动过程具体解析
在之前的一篇文章中.从概念上学习了Andoird系统的启动过程.Android系统启动过程学习而在这篇文章中,我们将从代码角度细致学习Android系统的启动过程,同一时候,学习Android启动过 ...
超全面的JavaWeb笔记day15<mysql数据库>
1.数据库的概述 2.SQL 3.DDL 4.DML 5.DCL 6.DQL MySQL 数据库 1 数据库概念(了解) 1.1 什么是数据库数据库就是用来存储和管理数据的仓库! 数据库存储数据的优 ...
POJ 3211 Washing Cloths(01背包变形)
Q: 01背包最后返回什么 dp[v], v 是多少? A: 普通01背包需要遍历, 从大到小. 但此题因为物品的总重量必定大于背包容量, 所以直接返回 dp[V] 即可 update 2014年3月 ...
Ext3.4-EXT之嵌套表格的实现
其中使用到的"RowExpander.js"为extjs官方示例中自带的. 实现这个嵌套表格要注意两点技巧: 1 提供给外层表格的dataStore的数据源以嵌套数组的形式表示细节 ...

Keras-在预训练好网络模型上进行fine-tune

方法

Keras-在预训练好网络模型上进行fine-tune的更多相关文章

随机推荐

热门专题