paper 147: Deep Learning -- Face Data Augmentation (1)

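1. In deep learning, when the training set is not large enough, the following four approaches are commonly used:

(1) Artificially enlarge the training set. Create a batch of "new" samples from the existing data by shifting, flipping, adding noise, and similar operations. This is data augmentation.

(2) Regularization. A small dataset tends to make the model overfit, so the training error is very small while the test error is very large. Adding a regularization term to the loss function suppresses overfitting. The drawback is that it introduces a hyper-parameter that has to be tuned by hand. See https://www.wikiwand.com/en/Regularization_(mathematics)

(3) Dropout. This is also a regularization technique, but unlike the one above it works by randomly setting the outputs of some neurons to zero. See http://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf

(4) Unsupervised pre-training. Use the convolutional form of an auto-encoder or RBM to do unsupervised pre-training layer by layer, then add a classification layer on top and do supervised fine-tuning. See http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.207.1102&rep=rep1&type=pdf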
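To make items (2) and (3) concrete, here is a minimal NumPy sketch. It is not taken from the references above: the function names, the choice of an L2 penalty, and the "inverted dropout" variant (which rescales the surviving activations at training time) are all assumptions made for illustration.

```python
import numpy as np

def l2_penalized_loss(data_loss, weights, lam):
    """Add an L2 penalty lam * sum(w^2) to an already computed data loss.

    lam is the hand-tuned hyper-parameter mentioned in item (2).
    """
    return data_loss + lam * sum(np.sum(w ** 2) for w in weights)

def dropout(activations, drop_prob, rng, training=True):
    """Inverted dropout: zero each unit with probability drop_prob and
    rescale the survivors so the expected activation is unchanged."""
    if not training or drop_prob == 0.0:
        return activations
    keep_prob = 1.0 - drop_prob
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
weights = [np.ones((2, 3)), np.full((3,), 2.0)]  # toy weights, hypothetical
loss = l2_penalized_loss(data_loss=1.0, weights=weights, lam=0.01)
hidden = dropout(np.ones((4, 5)), drop_prob=0.5, rng=rng)
```

With `drop_prob=0.5` every entry of `hidden` is either 0 (dropped) or 2 (kept and rescaled); at test time, `training=False` returns the activations untouched.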

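Now let us look at data augmentation in more detail:

Depending on the task, we can enlarge the input data by applying one or a combination of the following geometric and photometric image transformations. All of these methods come from digital image processing; introductions are easy to find online, so they are not covered one by one here.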

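Rotation / reflection: randomly rotate the image by some angle, changing the orientation of the image content;
Flip: flip the image horizontally or vertically;
Zoom: enlarge or shrink the image by a given ratio;
Shift: translate the image within the image plane. The shift range and step can be chosen randomly or specified by hand, and the shift is applied horizontally or vertically, changing the position of the image content;
Scale: enlarge or shrink the image by a given scale factor; or, following the idea behind SIFT feature extraction, filter the image with given scale factors to build a scale space. This changes the size or the degree of blur of the image content;
Contrast: in the HSV color space of the image, change the saturation (S) and value (V, brightness) components while keeping the hue (H) unchanged. Raise the S and V components of every pixel to a power (exponent between 0.25 and 4) to add illumination variation;
Noise: randomly perturb the RGB values of each pixel; common noise models are salt-and-pepper noise and Gaussian noise;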
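Color: run PCA on the RGB pixel values of the training set to obtain the three principal direction vectors $p_1, p_2, p_3$ of RGB space and the three eigenvalues $\lambda_1, \lambda_2, \lambda_3$. To every pixel $I_{xy} = [I^R_{xy}, I^G_{xy}, I^B_{xy}]^T$ of each image, add the following quantity:

$$[p_1, p_2, p_3]\,[\alpha_1 \lambda_1, \alpha_2 \lambda_2, \alpha_3 \lambda_3]^T$$

where each $\alpha_i$ is a Gaussian random variable with mean 0 and standard deviation 0.1, drawn once per image, as in the AlexNet paper that introduced this scheme.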
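The color transformation can be sketched in NumPy as follows. This is an illustration under stated assumptions, not a reference implementation: the function names are hypothetical, images are assumed to be float arrays of shape (N, H, W, 3), and 0.1 is treated as the standard deviation of the Gaussian.

```python
import numpy as np

def fit_rgb_pca(images):
    """PCA over all RGB pixels of the training set.

    images: float array of shape (N, H, W, 3).
    Returns (eigvals, eigvecs); the columns of eigvecs are p1, p2, p3.
    """
    pixels = images.reshape(-1, 3)
    cov = np.cov(pixels, rowvar=False)      # 3x3 covariance (data is centered internally)
    eigvals, eigvecs = np.linalg.eigh(cov)  # symmetric matrix, so eigh is appropriate
    return eigvals, eigvecs

def pca_color_jitter(image, eigvals, eigvecs, rng, sigma=0.1):
    """Add [p1, p2, p3] [a1*l1, a2*l2, a3*l3]^T to every pixel of one image."""
    alphas = rng.normal(0.0, sigma, size=3)  # one draw per image
    delta = eigvecs @ (alphas * eigvals)     # a single RGB offset, shape (3,)
    return image + delta                     # broadcast over height and width

rng = np.random.default_rng(0)
images = rng.random((8, 4, 4, 3))            # stand-in for a training set
eigvals, eigvecs = fit_rgb_pca(images)
jittered = pca_color_jitter(images[0], eigvals, eigvecs, rng)
```

Because the same offset is added to every pixel, the transformation changes the overall color and intensity of the image without altering its spatial content.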

Code implementation

For the implementation part, here is an example in Python that uses the open-source library Keras:

    # -*- coding: utf-8 -*-
    __author__ = 'Administrator'

    # import packages
    from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img

    datagen = ImageDataGenerator(
        rotation_range=0.2,      # in degrees; 0.2 is barely visible, try e.g. 40 for a clear rotation
        width_shift_range=0.2,   # fraction of the image width
        height_shift_range=0.2,  # fraction of the image height
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
        fill_mode='nearest')

    # Raw strings keep the backslashes of the Windows paths from being read as escape sequences.
    # This is a PIL image; replace the path with your own file.
    img = load_img(r'C:\Users\Administrator\Desktop\dataA\lena.jpg')
    x = img_to_array(img)          # NumPy array, (height, width, 3) with channels-last ordering
    x = x.reshape((1,) + x.shape)  # add a batch axis: (1, height, width, 3)

    # the .flow() call below generates batches of randomly transformed images
    # and saves the results to the directory given by save_to_dir

    i = 0
    for batch in datagen.flow(x,
                              batch_size=1,
                              save_to_dir=r'C:\Users\Administrator\Desktop\dataA\pre',  # where the generated images are saved
                              save_prefix='lena',
                              save_format='jpg'):
        i += 1
        if i > 20:
            break  # otherwise the generator would loop indefinitely

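Main function: ImageDataGenerator implements most of the geometric image transformations mentioned above.

rotation_range: rotation range, in degrees (0 to 180), within which the image is rotated randomly;
width_shift_range and height_shift_range: shift the image randomly in the horizontal or vertical direction, within a range given as a small fraction of the image width or height;
rescale: multiply the pixel values by the given factor (this does not resize the image); the value is usually between 0 and 1, typically 1/255;
shear_range: shear transformation; see https://keras.io/preprocessing/image/
zoom_range: randomly zoom the image by a ratio;
horizontal_flip: flip the image horizontally;
fill_mode: how to fill the pixels that become empty after a rotation or shift.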
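To show what a horizontal flip and a width shift with nearest-pixel filling do to the pixel grid, here is a small NumPy sketch. It is an illustration only: Keras applies an affine transform with interpolation, whereas this sketch assumes whole-pixel shifts, and all function names are hypothetical.

```python
import numpy as np

def horizontal_flip(image):
    """Mirror an (H, W, C) image left to right."""
    return image[:, ::-1, :]

def shift_width_nearest(image, dx):
    """Shift an (H, W, C) image dx pixels to the right (to the left if dx < 0).

    The exposed columns are filled by repeating the nearest edge column,
    which mimics fill_mode='nearest'.
    """
    if dx == 0:
        return image.copy()
    width = image.shape[1]
    if dx > 0:
        padded = np.pad(image, ((0, 0), (dx, 0), (0, 0)), mode="edge")
        return padded[:, :width, :]
    padded = np.pad(image, ((0, 0), (0, -dx), (0, 0)), mode="edge")
    return padded[:, -width:, :]

def random_width_shift(image, shift_range, rng):
    """Draw a shift as a fraction of the width in [-shift_range, shift_range]."""
    dx = int(round(rng.uniform(-shift_range, shift_range) * image.shape[1]))
    return shift_width_nearest(image, dx)

image = np.arange(8).reshape(2, 4, 1)  # tiny 2x4 single-channel image
flipped = horizontal_flip(image)
shifted = shift_width_nearest(image, 1)
randomly_shifted = random_width_shift(image, 0.2, np.random.default_rng(0))
```

In the first row, `[0, 1, 2, 3]` becomes `[3, 2, 1, 0]` after the flip and `[0, 0, 1, 2]` after a one-pixel shift to the right: the leftmost column is repeated to fill the gap.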

The constructor prototype of ImageDataGenerator is as follows:

    keras.preprocessing.image.ImageDataGenerator(featurewise_center=False,
        samplewise_center=False,
        featurewise_std_normalization=False,
        samplewise_std_normalization=False,
        zca_whitening=False,
        rotation_range=0.,
        width_shift_range=0.,
        height_shift_range=0.,
        shear_range=0.,
        zoom_range=0.,
        channel_shift_range=0.,
        fill_mode='nearest',
        cval=0.,
        horizontal_flip=False,
        vertical_flip=False,
        rescale=None,
        dim_ordering=K.image_dim_ordering())

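Parameters:

featurewise_center: boolean; center the input dataset so that its mean is 0

samplewise_center: boolean; set the mean of each input sample to 0

featurewise_std_normalization: boolean; divide the inputs by the standard deviation of the dataset to standardize them

samplewise_std_normalization: boolean; divide each input sample by its own standard deviation

zca_whitening: boolean; apply ZCA whitening to the input data

rotation_range: integer; the range, in degrees, of random rotations applied during augmentation

width_shift_range: float; a fraction of the image width, the magnitude of random horizontal shifts applied during augmentation

height_shift_range: float; a fraction of the image height, the magnitude of random vertical shifts applied during augmentation

shear_range: float; shear intensity (shear angle in the counter-clockwise direction)

zoom_range: float or a list of the form [lower, upper]; the magnitude of random zoom. A float is equivalent to [lower, upper] = [1 - zoom_range, 1 + zoom_range]

channel_shift_range: float; the magnitude of random channel shifts

fill_mode: one of 'constant', 'nearest', 'reflect', or 'wrap'; points outside the boundaries of the input after a transformation are filled according to this mode

cval: float or integer; the value used for points outside the boundaries when fill_mode='constant'

horizontal_flip: boolean; randomly flip inputs horizontally

vertical_flip: boolean; randomly flip inputs vertically

rescale: rescaling factor, None by default. If None or 0, no rescaling is applied; otherwise the data is multiplied by this value. The documentation this list is based on says the multiplication happens before the other transformations, while later Keras documentation describes it as applied after them.

dim_ordering: one of 'tf' and 'th'; specifies the order of the data dimensions. In 'tf' mode the data has shape (samples, height, width, channels); in 'th' mode it has shape (samples, channels, height, width). The default is the image_dim_ordering value in the Keras configuration file ~/.keras/keras.json; in the version described here that is 'th' if you have never set it. Keras 2 renamed this parameter to data_format, with the values 'channels_last' and 'channels_first'.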
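The normalization flags and the scalar form of zoom_range can be illustrated with a few lines of NumPy. This is a sketch, not the Keras source: the function names are hypothetical, the data is assumed to be channels-last with shape (samples, height, width, channels), and computing the feature-wise statistics per channel (rather than per pixel position) is an assumption.

```python
import numpy as np

def zoom_bounds(zoom_range):
    """Expand a scalar zoom_range z into (1 - z, 1 + z); pass [lower, upper] through."""
    if np.isscalar(zoom_range):
        return 1.0 - zoom_range, 1.0 + zoom_range
    lower, upper = zoom_range
    return float(lower), float(upper)

def featurewise_standardize(batch, eps=1e-7):
    """featurewise_center + featurewise_std_normalization:
    statistics are computed over the whole dataset, one value per channel."""
    mean = batch.mean(axis=(0, 1, 2), keepdims=True)
    std = batch.std(axis=(0, 1, 2), keepdims=True)
    return (batch - mean) / (std + eps)

def samplewise_standardize(batch, eps=1e-7):
    """samplewise_center + samplewise_std_normalization:
    every image is normalized with its own mean and standard deviation."""
    mean = batch.mean(axis=(1, 2, 3), keepdims=True)
    std = batch.std(axis=(1, 2, 3), keepdims=True)
    return (batch - mean) / (std + eps)

rng = np.random.default_rng(0)
batch = rng.uniform(0, 255, size=(16, 8, 8, 3))  # stand-in for an image dataset
fw = featurewise_standardize(batch)
sw = samplewise_standardize(batch)
lower, upper = zoom_bounds(0.2)
zoom_factor = rng.uniform(lower, upper)          # one random zoom factor
```

The difference between the two families is only the axes over which the statistics are taken: the dataset axis is included for the feature-wise variants and excluded for the sample-wise ones.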

Some data augmentation operations in TensorFlow

    # Written against the TensorFlow 1.x API (tf.app.flags, tf.placeholder, tf.Session).
    import tensorflow as tf
    import cv2
    import numpy as np

    flags = tf.app.flags
    FLAGS = flags.FLAGS
    flags.DEFINE_boolean('random_flip_up_down', True, 'If uses flip')
    flags.DEFINE_boolean('random_flip_left_right', True, 'If uses flip')
    flags.DEFINE_boolean('random_brightness', True, 'If uses brightness')
    flags.DEFINE_boolean('random_contrast', True, 'If uses contrast')
    flags.DEFINE_boolean('random_saturation', True, 'If uses saturation')
    flags.DEFINE_integer('image_size', 224, 'image size.')

    """
    #flags examples
    flags.DEFINE_float('learning_rate', 0.01, 'Initial learning rate.')
    flags.DEFINE_integer('max_steps', 2000, 'Number of steps to run trainer.')
    flags.DEFINE_string('train_dir', 'data', 'Directory to put the training data.')
    flags.DEFINE_boolean('fake_data', False, 'If true, uses fake data for unit testing.')
    """

    def pre_process(images):
        # Apply every augmentation enabled in FLAGS, then resize to a square image.
        if FLAGS.random_flip_up_down:
            images = tf.image.random_flip_up_down(images)
        if FLAGS.random_flip_left_right:
            images = tf.image.random_flip_left_right(images)
        if FLAGS.random_brightness:
            images = tf.image.random_brightness(images, max_delta=0.3)
        if FLAGS.random_contrast:
            images = tf.image.random_contrast(images, 0.8, 1.2)
        if FLAGS.random_saturation:
            # keep the result; the op has no effect unless it is assigned
            images = tf.image.random_saturation(images, 0.3, 0.5)
        new_size = tf.constant([FLAGS.image_size, FLAGS.image_size], dtype=tf.int32)
        images = tf.image.resize_images(images, new_size)  # returns float32
        return images

    raw_image = cv2.imread("004545.jpg")  # uint8 array in BGR channel order
    #image = tf.Variable(raw_image)
    image = tf.placeholder("uint8", [None, None, 3])
    images = pre_process(image)

    with tf.Session() as session:
        result = session.run(images, feed_dict={image: raw_image})

    cv2.imshow("image", result.astype(np.uint8))
    cv2.waitKey(1000)

[Figure: output of the TensorFlow augmentation code above]
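For readers without a TensorFlow 1.x installation, the brightness and contrast steps can be approximated in NumPy. This is a sketch under assumptions: images are floats in [0, 1], the function names simply mirror the TensorFlow ones, and the final clipping to [0, 1] is a choice made here rather than something the TensorFlow ops do for float inputs.

```python
import numpy as np

def random_brightness(image, max_delta, rng):
    """Add one random offset in [-max_delta, max_delta) to every pixel."""
    delta = rng.uniform(-max_delta, max_delta)
    return np.clip(image + delta, 0.0, 1.0)

def random_contrast(image, lower, upper, rng):
    """Scale each channel's deviation from its mean by a random factor."""
    factor = rng.uniform(lower, upper)
    mean = image.mean(axis=(0, 1), keepdims=True)  # per-channel mean
    return np.clip((image - mean) * factor + mean, 0.0, 1.0)

rng = np.random.default_rng(0)
image = rng.random((6, 6, 3))                      # stand-in for a float RGB image
augmented = random_brightness(image, max_delta=0.3, rng=rng)
augmented = random_contrast(augmented, 0.8, 1.2, rng=rng)
```

A contrast factor below 1 pulls the pixels toward the channel mean (flatter image), and a factor above 1 pushes them away from it.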

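2. Summary of commonly used data augmentation methods

(1) Color jittering: color augmentation that varies the brightness, saturation, and contrast of the image (the author is not sure this reading of color jittering is accurate);

(2) PCA jittering: first compute the mean and standard deviation of the three RGB channels, then compute the covariance matrix over the whole training set and eigendecompose it; the resulting eigenvectors and eigenvalues are used for PCA jittering;

(3) Random scale: scale transformation;

(4) Random crop: crop and rescale the image using random image interpolation; this includes the scale jittering method (used by the VGG and ResNet models) as well as scale and aspect-ratio augmentation;

(5) Horizontal/vertical flip: flip the image horizontally or vertically;

(6) Shift: translation;

(7) Rotation/reflection: rotation and reflection (more generally, affine) transformations;

(8) Noise: Gaussian noise, blurring;

(9) Label shuffle: augmentation for class-imbalanced data; see Hikvision's ILSVRC 2016 report, which describes a supervised data augmentation method.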
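Two of the less obvious items, random crop (4) and label shuffle (9), can be sketched as below. The label-shuffle function follows one common reading of the idea (oversample every class up to the size of the largest one by indexing with a random permutation modulo the class size); it is an assumption of this sketch, not code from the Hikvision report, and the function names are hypothetical.

```python
import numpy as np

def random_crop(image, crop_h, crop_w, rng):
    """Cut a random crop_h x crop_w window out of an (H, W, C) image."""
    top = rng.integers(0, image.shape[0] - crop_h + 1)
    left = rng.integers(0, image.shape[1] - crop_w + 1)
    return image[top:top + crop_h, left:left + crop_w]

def label_shuffle_indices(labels, rng):
    """Return sample indices in which every class appears equally often.

    For each class, a random permutation of range(largest_class_size) is
    taken modulo the class size, so small classes are repeated evenly.
    """
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()
    balanced = []
    for cls, count in zip(classes, counts):
        class_indices = np.flatnonzero(labels == cls)
        picks = rng.permutation(target) % count
        balanced.append(class_indices[picks])
    balanced = np.concatenate(balanced)
    rng.shuffle(balanced)  # mix the classes for one training epoch
    return balanced

rng = np.random.default_rng(0)
labels = np.array([0, 0, 0, 0, 1, 2, 2])  # imbalanced toy labels
epoch_indices = label_shuffle_indices(labels, rng)
crop = random_crop(np.zeros((32, 32, 3)), 24, 24, rng)
```

Here the largest class has 4 samples, so `epoch_indices` holds 12 indices with 4 per class; the repeated samples of the small classes would normally be passed through the random transformations above so that they do not appear as exact duplicates.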

Sources: http://blog.csdn.net/mduanfire/article/details/51674098

https://zhuanlan.zhihu.com/p/23249000
