Deep learning: 47 (A simple look at Stochastic Pooling)

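In a CNN, the step that follows convolution is called pooling. At ICLR 2013, Zeiler proposed another pooling method, stochastic pooling, as an alternative to the most common ones (mean-pooling and max-pooling). His paper also describes a probability weighted pooling method that performs slightly worse.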

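Stochastic pooling is very simple: an element of the pooling region in the feature map is picked at random, with a probability proportional to its value, so larger elements are more likely to be picked. This differs from max-pooling, which always takes the largest element.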

Suppose a 3*3 pooling region of the feature map contains the element values 0, 1.1, 2.5, 0.9, 2.0, 1.0, 0, 1.5 and 1.0.

Their sum is sum=0+1.1+2.5+0.9+2.0+1.0+0+1.5+1.0=10

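Dividing every element of the region by sum gives the matrix of probabilities, whose elements are 0, 0.11, 0.25, 0.09, 0.2, 0.1, 0, 0.15 and 0.1.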

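Each element is the probability of the value at the corresponding position, and all that remains is to pick one position at random according to these probabilities. To do so, treat the nine values as a multinomial distribution over nine outcomes and sample from it; Theano provides a multinomial() function that does this directly. You can also sample it yourself with a uniform distribution on [0, 1]: split the unit interval into 9 sub-intervals according to the 9 probabilities (the larger the probability, the longer the sub-interval, and each sub-interval corresponds to one position), then generate a random number and see which sub-interval it falls into.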

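For example, suppose the sampled matrix has a 1 at the position of the element 1.5 and 0 everywhere else.

The pooling value is then 1.5.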
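The interval-based sampling described above can be sketched in NumPy roughly as follows. This is a minimal illustration and not the pylearn2 code: the function name `stochastic_pool_region` is hypothetical, and the region is flattened in the order the values appear in the sum above.

```python
import numpy as np

# Pooling region from the example, in the order used in the sum above.
region = np.array([0, 1.1, 2.5, 0.9, 2.0, 1.0, 0, 1.5, 1.0])

def stochastic_pool_region(values, rng):
    """Pick one element with probability proportional to its value.

    Hypothetical helper for illustration; values must be non-negative.
    """
    total = values.sum()
    if total == 0:
        # All-zero region: nothing to sample, the pooled value is 0.
        return 0, 0.0
    probs = values / total
    # Split [0, 1) into intervals whose lengths are the probabilities,
    # then find the interval that a uniform random number falls into.
    edges = np.cumsum(probs)
    u = rng.uniform(0.0, 1.0)
    index = int(np.searchsorted(edges, u, side="right"))
    index = min(index, len(values) - 1)  # guard against rounding in cumsum
    return index, float(values[index])

rng = np.random.default_rng(0)
index, pooled = stochastic_pool_region(region, rng)
print(index, pooled)
```

Elements equal to 0 get an interval of length zero, so they are never selected, while 2.5 is selected in roughly a quarter of the draws.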

Using stochastic pooling at test time is also simple: take the probability-weighted average over the pooling region. For the example above, the computation is:

0*0+1.1*0.11+2.5*0.25+0.9*0.09+2.0*0.2+1.0*0.1+0*0+1.5*0.15+1.0*0.1=1.652, so the result of pooling this small region is 1.652.
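As a quick check of this weighted average, here is a small NumPy sketch. The function name `probability_weighted_pool` is hypothetical, and the row-by-row 3*3 layout is an assumption (the result does not depend on the layout).

```python
import numpy as np

# Example region; the row-by-row arrangement is assumed.
region = np.array([[0.0, 1.1, 2.5],
                   [0.9, 2.0, 1.0],
                   [0.0, 1.5, 1.0]])

def probability_weighted_pool(values):
    """Test-time pooling: average the values weighted by their probabilities."""
    total = values.sum()
    if total == 0:
        return 0.0  # all-zero region pools to 0
    probs = values / total
    return float((probs * values).sum())

pooled = probability_weighted_pool(region)
print(round(pooled, 3))  # 1.652
```

Since each probability is value/sum, the result equals the sum of squared values divided by the sum of values, here 16.52/10.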

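In backpropagation, the gradient is passed only to the position that was selected and recorded during the forward pass; every other position gets 0. This is very similar to backpropagation through max-pooling.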
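A minimal sketch of that gradient routing for a single pooling region, assuming the forward pass stored the flat index of the selected element. The function name and arguments are hypothetical and are not part of pylearn2.

```python
import numpy as np

def stochastic_pool_backward(region_shape, selected_index, upstream_grad):
    """Route the upstream gradient to the selected position only."""
    grad = np.zeros(region_shape)
    # Only the element chosen in the forward pass receives gradient.
    grad.flat[selected_index] = upstream_grad
    return grad

# Flat index 7 is row 2, column 1 of a 3*3 region, which is where 1.5
# sits if the example region is laid out row by row (an assumption).
grad = stochastic_pool_backward((3, 3), 7, 0.5)
print(grad)
```

The only difference from max-pooling is how the index is chosen: sampled here, argmax there.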

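Advantages of stochastic pooling:

The method is simple;

It generalizes better;

It can be used in convolutional layers (the paper compares it with Dropout and DropConnect and says those two are not well suited to convolutional layers. Personally I do not think they are really comparable, because they act on different structures in the network);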

As for why stochastic pooling works well, the authors say it is also a form of model averaging; I did not really understand that part.
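One way to make the model-averaging remark more concrete, offered here as an interpretation and not as the paper's argument: the test-time weighted average is exactly the expected value of the training-time sample, so it behaves like an average over many sampled "models". The sketch below checks this numerically on the example region; the sample count is arbitrary.

```python
import numpy as np

region = np.array([0, 1.1, 2.5, 0.9, 2.0, 1.0, 0, 1.5, 1.0])
probs = region / region.sum()

rng = np.random.default_rng(0)
# Training-time behaviour: draw many stochastic pooling outputs.
samples = rng.choice(region, size=200_000, p=probs)
monte_carlo_mean = samples.mean()

# Test-time behaviour: probability-weighted average of the region.
weighted_average = float((probs * region).sum())

print(monte_carlo_mean, weighted_average)
```

The two numbers agree up to sampling noise, which is the sense in which the test-time rule averages over the random choices made during training.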

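For the forward pass and the test-time inference of stochastic pooling, see the code below (it does not include backpropagation, so the positions selected during pooling are not saved).

Source: pylearn2/stochastic_pool.py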

"""

An implementation of stochastic max-pooling, based on

Stochastic Pooling for Regularization of Deep Convolutional Neural Networks

Matthew D. Zeiler, Rob Fergus, ICLR 2013

"""

__authors__ = "Mehdi Mirza"

__copyright__ = "Copyright 2010-2012, Universite de Montreal"

__credits__ = ["Mehdi Mirza", "Ian Goodfellow"]

__license__ = "3-clause BSD"

__maintainer__ = "Mehdi Mirza"

__email__ = "mirzamom@iro"

import numpy

import theano

from theano import tensor

from theano.sandbox.rng_mrg import MRG_RandomStreams as RandomStreams

from theano.gof.op import get_debug_values

def stochastic_max_pool_bc01(bc01, pool_shape, pool_stride, image_shape, rng = None):

    """

    Stochastic max pooling for training as defined in:

    Stochastic Pooling for Regularization of Deep Convolutional Neural Networks

    Matthew D. Zeiler, Rob Fergus

    bc01: minibatch in format (batch size, channels, rows, cols),

        IMPORTANT: All values should be positive

    pool_shape: shape of the pool region (rows, cols)

    pool_stride: strides between pooling regions (row stride, col stride)

    image_shape: avoid doing some of the arithmetic in theano

    rng: theano random stream

    """

    r, c = image_shape

    pr, pc = pool_shape

    rs, cs = pool_stride

    batch = bc01.shape[0] # number of examples in the minibatch

    channel = bc01.shape[1] # number of channels

    if rng is None:

        rng = RandomStreams(2022)

    # Compute index in pooled space of last needed pool

    # (needed = each input pixel must appear in at least one pool)

    def last_pool(im_shp, p_shp, p_strd):

        rval = int(numpy.ceil(float(im_shp - p_shp) / p_strd))

        assert p_strd * rval + p_shp >= im_shp

        assert p_strd * (rval - 1) + p_shp < im_shp

        return rval # number of stride steps needed during pooling

    # Compute starting row of the last pool

    last_pool_r = last_pool(image_shape[0] ,pool_shape[0], pool_stride[0]) * pool_stride[0] # starting row of the last pool

    # Compute number of rows needed in image for all indexes to work out

    required_r = last_pool_r + pr # image height needed to satisfy the pooling condition above

    last_pool_c = last_pool(image_shape[1] ,pool_shape[1], pool_stride[1]) * pool_stride[1]

    required_c = last_pool_c + pc

    # final result shape

    res_r = int(numpy.floor(last_pool_r/rs)) + 1 # shape of the image after pooling

    res_c = int(numpy.floor(last_pool_c/cs)) + 1

    for bc01v in get_debug_values(bc01):

        assert not numpy.any(numpy.isinf(bc01v))

        assert bc01v.shape[2] == image_shape[0]

        assert bc01v.shape[3] == image_shape[1]

    # padding: if the strides do not tile the image exactly, the original image must be extended

    padded = tensor.alloc(0.0, batch, channel, required_r, required_c)

    name = bc01.name

    if name is None:

        name = 'anon_bc01'

    bc01 = tensor.set_subtensor(padded[:,:, 0:r, 0:c], bc01)

    bc01.name = 'zero_padded_' + name

    # unraveling

    window = tensor.alloc(0.0, batch, channel, res_r, res_c, pr, pc)

    window.name = 'unravlled_winodows_' + name

    for row_within_pool in xrange(pool_shape[0]):

        row_stop = last_pool_r + row_within_pool + 1

        for col_within_pool in xrange(pool_shape[1]):

            col_stop = last_pool_c + col_within_pool + 1

            win_cell = bc01[:,:,row_within_pool:row_stop:rs, col_within_pool:col_stop:cs]

            window  =  tensor.set_subtensor(window[:,:,:,:, row_within_pool, col_within_pool], win_cell) # window holds all the pooling regions

    # find the norm

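    norm = window.sum(axis = [4, 5]) # sum over each window, used as the denominator

    norm = tensor.switch(tensor.eq(norm, 0.0), 1.0, norm) # if the sum is 0, use 1 instead to avoid dividing by zero

    norm = window / norm.dimshuffle(0, 1, 2, 3, 'x', 'x') # divide by the sum to get the probability of each position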

    # get prob

    prob = rng.multinomial(pvals = norm.reshape((batch * channel * res_r * res_c, pr * pc)), dtype='float32') # multinomial() draws one sample per row of pvals; the entries of the result are 0 or 1

    # select

    res = (window * prob.reshape((batch, channel, res_r, res_c,  pr, pc))).max(axis=5).max(axis=4) # window * prob is an element-wise product (NumPy-style *), not a matrix product

    res.name = 'pooled_' + name

    return tensor.cast(res, theano.config.floatX)

def weighted_max_pool_bc01(bc01, pool_shape, pool_stride, image_shape, rng = None):

    """

    This implements test time probability weighted pooling defined in:

    Stochastic Pooling for Regularization of Deep Convolutional Neural Networks

    Matthew D. Zeiler, Rob Fergus

    bc01: minibatch in format (batch size, channels, rows, cols),

        IMPORTANT: All values should be positive

    pool_shape: shape of the pool region (rows, cols)

    pool_stride: strides between pooling regions (row stride, col stride)

    image_shape: avoid doing some of the arithmetic in theano

    """

    r, c = image_shape

    pr, pc = pool_shape

    rs, cs = pool_stride

    batch = bc01.shape[0]

    channel = bc01.shape[1]

    if rng is None:

        rng = RandomStreams(2022)

    # Compute index in pooled space of last needed pool

    # (needed = each input pixel must appear in at least one pool)

    def last_pool(im_shp, p_shp, p_strd):

        rval = int(numpy.ceil(float(im_shp - p_shp) / p_strd))

        assert p_strd * rval + p_shp >= im_shp

        assert p_strd * (rval - 1) + p_shp < im_shp

        return rval

    # Compute starting row of the last pool

    last_pool_r = last_pool(image_shape[0] ,pool_shape[0], pool_stride[0]) * pool_stride[0]

    # Compute number of rows needed in image for all indexes to work out

    required_r = last_pool_r + pr

    last_pool_c = last_pool(image_shape[1] ,pool_shape[1], pool_stride[1]) * pool_stride[1]

    required_c = last_pool_c + pc

    # final result shape

    res_r = int(numpy.floor(last_pool_r/rs)) + 1

    res_c = int(numpy.floor(last_pool_c/cs)) + 1

    for bc01v in get_debug_values(bc01):

        assert not numpy.any(numpy.isinf(bc01v))

        assert bc01v.shape[2] == image_shape[0]

        assert bc01v.shape[3] == image_shape[1]

    # padding

    padded = tensor.alloc(0.0, batch, channel, required_r, required_c)

    name = bc01.name

    if name is None:

        name = 'anon_bc01'

    bc01 = tensor.set_subtensor(padded[:,:, 0:r, 0:c], bc01)

    bc01.name = 'zero_padded_' + name

    # unraveling

    window = tensor.alloc(0.0, batch, channel, res_r, res_c, pr, pc)

    window.name = 'unravlled_winodows_' + name

    for row_within_pool in xrange(pool_shape[0]):

        row_stop = last_pool_r + row_within_pool + 1

        for col_within_pool in xrange(pool_shape[1]):

            col_stop = last_pool_c + col_within_pool + 1

            win_cell = bc01[:,:,row_within_pool:row_stop:rs, col_within_pool:col_stop:cs]

            window  =  tensor.set_subtensor(window[:,:,:,:, row_within_pool, col_within_pool], win_cell)

    # find the norm

    norm = window.sum(axis = [4, 5])

    norm = tensor.switch(tensor.eq(norm, 0.0), 1.0, norm)

    norm = window / norm.dimshuffle(0, 1, 2, 3, 'x', 'x')

    # average

    res = (window * norm).sum(axis=[4,5]) # the code above is almost identical to the training-time forward pass; here only a weighted sum is needed

    res.name = 'pooled_' + name

    return res.reshape((batch, channel, res_r, res_c))
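To see the shape arithmetic and the two pooling modes without Theano, the same steps can be sketched in plain NumPy for a single 2-D feature map. This is an illustrative re-implementation under stated assumptions, not the pylearn2 code: `pooled_length` and `stochastic_pool_2d` are hypothetical names, the loops trade speed for readability, and the input must be non-negative.

```python
import numpy as np

def pooled_length(im_len, pool_len, stride):
    """Return (number of windows, padded length) along one axis.

    Mirrors the arithmetic of last_pool() in the listing above.
    """
    last_index = int(np.ceil(float(im_len - pool_len) / stride))
    last_start = last_index * stride
    return last_index + 1, last_start + pool_len

def stochastic_pool_2d(image, pool_shape, pool_stride, rng, train=True):
    """Stochastic pooling (train) or probability weighted pooling (test)."""
    rows, cols = image.shape
    (pr, pc), (rs, cs) = pool_shape, pool_stride
    out_r, need_r = pooled_length(rows, pr, rs)
    out_c, need_c = pooled_length(cols, pc, cs)
    # Zero-pad so that the last window fits inside the array.
    padded = np.zeros((need_r, need_c))
    padded[:rows, :cols] = image
    out = np.zeros((out_r, out_c))
    for i in range(out_r):
        for j in range(out_c):
            win = padded[i * rs:i * rs + pr, j * cs:j * cs + pc].ravel()
            total = win.sum()
            if total == 0:
                continue  # an all-zero window pools to 0
            probs = win / total
            if train:
                out[i, j] = rng.choice(win, p=probs)  # sample one value
            else:
                out[i, j] = (probs * win).sum()  # weighted average
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
rng = np.random.default_rng(0)
print(stochastic_pool_2d(image, (3, 3), (2, 2), rng, train=True))
print(stochastic_pool_2d(image, (3, 3), (2, 2), rng, train=False))
```

With a 5*5 input, a 3*3 pool and stride 2, no padding is needed and the output is 2*2; a 6*6 input would be padded to 7*7 and give a 3*3 output.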

References:

　　Stochastic Pooling for Regularization of Deep Convolutional Neural Networks. Matthew D. Zeiler, Rob Fergus.

pylearn2/stochastic_pool.py
