Data Augmentation: A Summary

A summary of several data augmentation methods

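In deep learning, data augmentation is a good option when the training set is too small, when some classes have few samples, or when you want to reduce overfitting and make the model more robust.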

Common methods

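Color Jittering: augmentation in color space, i.e. random changes to image brightness, saturation and contrast (I am not sure whether this reading of "color jittering" is accurate);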

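PCA Jittering: first compute the mean and standard deviation of each of the three RGB channels, then compute the covariance matrix of the RGB values over the whole training set and eigendecompose it; the resulting eigenvectors and eigenvalues are used to perturb pixel colors along the principal components;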

Random Scale: scale transformation;

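Random Crop: crop and resize the image, using a randomly chosen interpolation method; this includes Scale Jittering (used by the VGG and ResNet models) as well as scale and aspect-ratio augmentation;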

Horizontal/Vertical Flip: horizontal or vertical flipping;

Shift: translation;

Rotation/Reflection: rotation or reflection;

Noise: Gaussian noise, blurring;

Label shuffle: augmentation for class-imbalanced data; see Hikvision's ILSVRC2016 report. The report also proposes a Supervised Data Augmentation method, which interested readers may want to try out.
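The PCA Jittering item above can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the function names `pca_color_stats` and `pca_jitter`, the input format (a list of H x W x 3 uint8 arrays) and the parameter `alpha_std` are all assumptions made for this example. The default of 0.1 follows the setting commonly attributed to the AlexNet paper, and the sketch only centers each channel rather than also dividing by the standard deviation.

```python
import numpy as np


def pca_color_stats(images):
    """Eigendecompose the 3x3 RGB covariance matrix of a set of images.

    images: iterable of H x W x 3 uint8 arrays (assumed input format).
    Returns (eigenvalues, eigenvectors), eigenvectors stored as columns.
    """
    pixels = np.concatenate([img.reshape(-1, 3) for img in images])
    pixels = pixels.astype(np.float64) / 255.0
    pixels -= pixels.mean(axis=0)            # center each channel
    cov = np.cov(pixels, rowvar=False)       # 3 x 3 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: the matrix is symmetric
    return eigvals, eigvecs


def pca_jitter(img, eigvals, eigvecs, alpha_std=0.1, rng=None):
    """Add one random RGB offset along the principal components to an image."""
    rng = np.random.default_rng() if rng is None else rng
    alpha = rng.normal(0.0, alpha_std, size=3)   # one random weight per component
    shift = eigvecs @ (alpha * eigvals)          # the same RGB offset for every pixel
    out = img.astype(np.float64) / 255.0 + shift
    return (np.clip(out, 0.0, 1.0) * 255.0).round().astype(np.uint8)
```

The statistics are computed once over the training set, while `pca_jitter` is called per image (and per epoch) with a fresh random `alpha`.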

Implementation of some of the methods

# -*- coding:utf-8 -*-
"""Data augmentation

   1. Flip
   2. Random crop
   3. Color jittering
   4. Shift
   5. Scale
   6. Contrast
   7. Noise
   8. Rotation / reflection
"""

import logging
import os
import random
import threading
import time

import numpy as np
from PIL import Image, ImageEnhance, ImageOps, ImageFile

logger = logging.getLogger(__name__)
# Let Pillow load truncated image files instead of raising an error
ImageFile.LOAD_TRUNCATED_IMAGES = True

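class DataAugmentation:
    """
    Data augmentation operations. Four of the eight methods listed in the
    module docstring are implemented: rotation, crop, color jittering and
    Gaussian noise.
    """

    def __init__(self):
        pass

    @staticmethod
    def openImage(image):
        return Image.open(image, mode="r")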

    @staticmethod
    def randomRotation(image, mode=Image.BICUBIC):
        """
        Rotate the image by a random angle (an integer from 1 to 359 degrees)
        :param image: PIL image
        :param mode: resampling filter: nearest, bilinear, or bicubic (default)
        :return: the rotated image
        """
        random_angle = np.random.randint(1, 360)
        return image.rotate(random_angle, mode)

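    @staticmethod
    def randomCrop(image):
        """
        Crop a centered square window of random size from the image.
        Assumes images of about 68x68 pixels; the window side is drawn from
        [40, 67], so it is always larger than 36x36. Only the window size is
        random, not its position.
        :param image: PIL image
        :return: the cropped image
        """
        image_width = image.size[0]
        image_height = image.size[1]
        crop_win_size = np.random.randint(40, 68)
        # Box (left, upper, right, lower) centered in the image; ">> 1" halves the value
        random_region = (
            (image_width - crop_win_size) >> 1, (image_height - crop_win_size) >> 1,
            (image_width + crop_win_size) >> 1, (image_height + crop_win_size) >> 1)
        return image.crop(random_region)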

    @staticmethod
    def randomColor(image):
        """
        Apply color jittering to the image
        :param image: PIL image
        :return: image with randomly altered saturation, brightness, contrast and sharpness
        """
        random_factor = np.random.randint(0, 31) / 10.  # random factor in [0.0, 3.0]
        color_image = ImageEnhance.Color(image).enhance(random_factor)  # adjust saturation
        random_factor = np.random.randint(10, 21) / 10.  # random factor in [1.0, 2.0]
        brightness_image = ImageEnhance.Brightness(color_image).enhance(random_factor)  # adjust brightness
        random_factor = np.random.randint(10, 21) / 10.  # random factor in [1.0, 2.0]
        contrast_image = ImageEnhance.Contrast(brightness_image).enhance(random_factor)  # adjust contrast
        random_factor = np.random.randint(0, 31) / 10.  # random factor in [0.0, 3.0]
        return ImageEnhance.Sharpness(contrast_image).enhance(random_factor)  # adjust sharpness

    @staticmethod
    def randomGaussian(image, mean=0.2, sigma=0.3):
        """
        Add Gaussian noise to every pixel of the image
        :param image: PIL image (converted to RGB before processing)
        :param mean: mean (offset) of the noise, in 0-255 intensity units
        :param sigma: standard deviation of the noise, in 0-255 intensity units
        :return: the noisy PIL image
        Note: with the default mean and sigma the noise is mostly below one
        intensity level, so pass larger values for a visible effect.
        """
        # Work on a writable float copy: np.asarray(image) is read-only, and adding
        # float noise to uint8 values in place would truncate or wrap around
        img = np.array(image.convert("RGB"), dtype=np.float32)
        img += np.random.normal(mean, sigma, img.shape)
        # Clamp to the valid range before converting back to 8-bit
        return Image.fromarray(np.clip(img, 0, 255).astype(np.uint8))

    @staticmethod
    def saveImage(image, path):
        image.save(path)

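def makeDir(path):
    """
    Create the directory path (including parents) if it does not exist
    :return: 0 if created, 1 if it already existed, -1 on failure
    """
    try:
        if os.path.exists(path):
            return 1
        os.makedirs(path)
        return 0
    except OSError as e:
        print(str(e))
        # -1 matches the failure value the caller checks for
        return -1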

def imageOps(func_name, image, des_path, file_name, times=5):
    """Apply the named operation `times` times and save each result under des_path."""
    funcMap = {"randomRotation": DataAugmentation.randomRotation,
               "randomCrop": DataAugmentation.randomCrop,
               "randomColor": DataAugmentation.randomColor,
               "randomGaussian": DataAugmentation.randomGaussian
               }
    if funcMap.get(func_name) is None:
        logger.error("%s does not exist", func_name)
        return -1
    for _i in range(0, times, 1):
        new_image = funcMap[func_name](image)
        # Output name: <operation><index><original file name>
        DataAugmentation.saveImage(new_image, os.path.join(des_path, func_name + str(_i) + file_name))

opsList = {"randomRotation", "randomCrop", "randomColor", "randomGaussian"}

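def threadOPS(path, new_path):
    """
    Apply every operation in opsList to each image under path, one thread per operation
    :param path: source image file or directory (sub-directories are processed recursively)
    :param new_path: destination directory
    :return: -1 if a destination directory cannot be created, otherwise None
    """
    if os.path.isdir(path):
        img_names = os.listdir(path)
    else:
        # A single file: split it so that the join below rebuilds the same path
        path, single_name = os.path.split(path)
        img_names = [single_name]
    # Make sure the destination exists (any negative return value means failure)
    if makeDir(new_path) < 0:
        print('create new dir failure')
        return -1
    for img_name in img_names:
        print(img_name)
        tmp_img_name = os.path.join(path, img_name)
        if os.path.isdir(tmp_img_name):
            if threadOPS(tmp_img_name, os.path.join(new_path, img_name)) == -1:
                return -1
        elif img_name != ".DS_Store":
            # Read the file and run all operations on it
            image = DataAugmentation.openImage(tmp_img_name)
            image.load()  # decode now, before the image is shared between threads
            threads = []
            for ops_name in opsList:
                thread = threading.Thread(target=imageOps,
                                          args=(ops_name, image, new_path, img_name,))
                thread.start()
                threads.append(thread)
            # Wait for all operations on this image before moving to the next one
            for thread in threads:
                thread.join()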

if __name__ == '__main__':
    threadOPS("/home/pic-image/train/12306train",
              "/home/pic-image/train/12306train3")

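The module docstring of the script lists flip, shift and scale, but the class does not implement them. The sketch below shows one way they could be written with Pillow. The function names and the parameters `max_shift`, `low` and `high` are hypothetical, chosen only for this example, and filling uncovered areas with black after a shift is likewise an assumption.

```python
import random

from PIL import Image, ImageOps


def random_flip(image):
    """Mirror the image left-right or top-bottom, chosen with equal probability."""
    if random.random() < 0.5:
        return ImageOps.mirror(image)  # horizontal flip
    return ImageOps.flip(image)        # vertical flip


def shift(image, dx, dy):
    """Translate the image by (dx, dy) pixels; uncovered areas stay black."""
    canvas = Image.new(image.mode, image.size)
    canvas.paste(image, (dx, dy))  # (dx, dy) is the new upper-left corner
    return canvas


def random_shift(image, max_shift=8):
    """Translate by a random offset of at most max_shift pixels per axis."""
    dx = random.randint(-max_shift, max_shift)
    dy = random.randint(-max_shift, max_shift)
    return shift(image, dx, dy)


def random_scale(image, low=0.8, high=1.2):
    """Resize the image by a random factor drawn uniformly from [low, high]."""
    factor = random.uniform(low, high)
    width, height = image.size
    new_size = (max(1, round(width * factor)), max(1, round(height * factor)))
    return image.resize(new_size, Image.BILINEAR)
```

Each function takes and returns a PIL image, so it could be registered in `funcMap` and `opsList` alongside the existing four operations.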
References

深度学习之图像的数据增强 (Image Data Augmentation in Deep Learning)

Zhihu (知乎)
