Deep learning: 47 (A Simple Understanding of Stochastic Pooling)
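In a CNN, convolution is followed by a step called pooling. At ICLR 2013, Zeiler proposed another pooling method called stochastic pooling (the most common ones being mean-pooling and max-pooling). The paper also gives a probability weighted pooling method, which performs slightly worse.
Stochastic pooling is very simple: within each pooling region of the feature map, one element is picked at random according to its probability, so an element with a larger value is more likely to be selected. This is unlike max-pooling, which always takes the single largest element.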
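Suppose the pooling region in the feature map holds the following values:
    0    1.1  2.5
    0.9  2.0  1.0
    0    1.5  1.0
The region is 3*3, and the sum of its elements is sum = 0+1.1+2.5+0.9+2.0+1.0+0+1.5+1.0 = 10.
Dividing every element in the grid by sum gives:
    0     0.11  0.25
    0.09  0.2   0.1
    0     0.15  0.1
Each entry is the probability of the corresponding position. Now we only need to pick one position at random according to these probabilities: treat them as a multinomial distribution over 9 outcomes and draw a sample from it. Theano provides multinomial() for exactly this. You can also sample by hand with a uniform distribution on [0, 1]: split the unit interval into 9 sub-intervals according to the 9 probabilities (a larger probability covers a longer sub-interval, and each sub-interval corresponds to one position), then generate a random number and see which sub-interval it falls into.
For example, suppose the sampled matrix is:
    0  0  0
    0  0  0
    0  1  0
Then the pooled value is 1.5.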
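The hand-rolled sampling described above can be sketched in NumPy. This is only an illustration, not the code from the paper: the helper name `stochastic_pool_region` is made up here, and the 3*3 values are assumed to be laid out row by row in the order they appear in the sum.

```python
import numpy as np

def stochastic_pool_region(region, rng):
    """Sample one activation from a non-negative pooling region.

    Hypothetical helper for illustration. Returns (pooled_value, flat_index).
    """
    a = np.asarray(region, dtype=float).ravel()
    total = a.sum()
    if total == 0.0:
        # All activations are zero: nothing to sample, the pooled value is 0.
        return 0.0, 0
    probs = a / total
    # Split [0, 1) into consecutive intervals whose lengths are the
    # probabilities, then find the interval a uniform number falls into.
    edges = np.cumsum(probs)
    u = rng.uniform(0.0, 1.0)
    idx = int(np.searchsorted(edges, u, side="right"))
    idx = min(idx, a.size - 1)  # guard against round-off in the last edge
    return float(a[idx]), idx

# Assumed row-by-row layout of the example region.
region = [[0.0, 1.1, 2.5],
          [0.9, 2.0, 1.0],
          [0.0, 1.5, 1.0]]
rng = np.random.default_rng(0)
value, index = stochastic_pool_region(region, rng)
print(value, index)
```

Positions whose value is 0 get an empty interval, so they are never selected. The returned index is what a backward pass would need to remember.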
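At test time, inference with stochastic pooling is also simple: take the probability-weighted average over the region. For the example above, the computation is:
0*0+1.1*0.11+2.5*0.25+0.9*0.09+2.0*0.2+1.0*0.1+0*0+1.5*0.15+1.0*0.1=1.652, so the pooled result for this small region is 1.652.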
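As a quick numerical check of this weighted sum, here is a small NumPy sketch. The function name `probability_weighted_pool` is hypothetical, and the region layout is again assumed to be row by row.

```python
import numpy as np

def probability_weighted_pool(region):
    """Test-time pooling: weight each activation by its own probability."""
    a = np.asarray(region, dtype=float)
    total = a.sum()
    if total == 0.0:
        # An all-zero region pools to zero.
        return 0.0
    probs = a / total
    return float((a * probs).sum())

region = [[0.0, 1.1, 2.5],
          [0.9, 2.0, 1.0],
          [0.0, 1.5, 1.0]]
print(round(probability_weighted_pool(region), 3))  # 1.652
```

Since each probability is the value divided by the sum, this is the same as the sum of squares divided by the sum: 16.52 / 10 = 1.652.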
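In backpropagation, the gradient is passed only to the position that was selected (and recorded) during the forward pass. All other positions get 0. This is very similar to backpropagation through max-pooling.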
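A minimal sketch of that backward step for a single region, assuming the forward pass saved the flat index of the selected element. The function name and arguments are hypothetical, not taken from any library.

```python
import numpy as np

def stochastic_pool_backward(region_shape, selected_index, grad_output):
    """Route the incoming gradient to the sampled position only."""
    grad_region = np.zeros(region_shape, dtype=float)
    # Only the element chosen in the forward pass receives gradient.
    grad_region.flat[selected_index] = grad_output
    return grad_region

# Example: the element at flat index 7 (row 2, column 1) was sampled.
print(stochastic_pool_backward((3, 3), 7, 2.0))
```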
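Advantages of stochastic pooling:
The method is simple;
It generalizes better;
It can be used in convolutional layers (the paper compares it with Dropout and DropConnect and says those two are not well suited to convolutional layers. Personally I don't think they are really comparable, because they act on different parts of the network);
As for why stochastic pooling works well, the authors say the method is also a form of model averaging. I did not really follow that argument.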
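One way to read the model-averaging remark, offered as my own interpretation rather than the paper's full argument: write a_i for the activations in one pooling region and p_i for their probabilities. The expected value of the training-time sample s is then exactly the probability-weighted sum used at test time.

```latex
\mathbb{E}[s] \;=\; \sum_{i} p_i \, a_i,
\qquad
p_i = \frac{a_i}{\sum_{k} a_k}
```

So, for a single region, the test-time output is the average of the training-time outputs over all possible choices of position.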
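Code for the training-time forward pass and the inference pass of stochastic pooling is given below for reference (it does not include the bp pass, so the positions selected during pooling are not saved).
Source: pylearn2/stochastic_pool.py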
"""
An implementation of stochastic max-pooling, based on Stochastic Pooling for Regularization of Deep Convolutional Neural Networks
Matthew D. Zeiler, Rob Fergus, ICLR 2013
"""

__authors__ = "Mehdi Mirza"
__copyright__ = "Copyright 2010-2012, Universite de Montreal"
__credits__ = ["Mehdi Mirza", "Ian Goodfellow"]
__license__ = "3-clause BSD"
__maintainer__ = "Mehdi Mirza"
__email__ = "mirzamom@iro"

import numpy
import theano
from theano import tensor
from theano.sandbox.rng_mrg import MRG_RandomStreams as RandomStreams
from theano.gof.op import get_debug_values


def stochastic_max_pool_bc01(bc01, pool_shape, pool_stride, image_shape, rng = None):
    """
    Stochastic max pooling for training as defined in:

    Stochastic Pooling for Regularization of Deep Convolutional Neural Networks
    Matthew D. Zeiler, Rob Fergus

    bc01: minibatch in format (batch size, channels, rows, cols),
        IMPORTANT: All values should be positive
    pool_shape: shape of the pool region (rows, cols)
    pool_stride: strides between pooling regions (row stride, col stride)
    image_shape: avoid doing some of the arithmetic in theano
    rng: theano random stream
    """
    r, c = image_shape
    pr, pc = pool_shape
    rs, cs = pool_stride

    batch = bc01.shape[0]    # batch size
    channel = bc01.shape[1]  # number of channels

    if rng is None:
        rng = RandomStreams(2022)

    # Compute index in pooled space of last needed pool
    # (needed = each input pixel must appear in at least one pool)
    def last_pool(im_shp, p_shp, p_strd):
        rval = int(numpy.ceil(float(im_shp - p_shp) / p_strd))
        assert p_strd * rval + p_shp >= im_shp
        assert p_strd * (rval - 1) + p_shp < im_shp
        return rval  # number of stride steps needed to reach the last pool

    # Compute starting row of the last pool
    last_pool_r = last_pool(image_shape[0] ,pool_shape[0], pool_stride[0]) * pool_stride[0]
    # Compute number of rows needed in image for all indexes to work out
    # (the image height required so that every pool fits)
    required_r = last_pool_r + pr

    last_pool_c = last_pool(image_shape[1] ,pool_shape[1], pool_stride[1]) * pool_stride[1]
    required_c = last_pool_c + pc

    # final result shape (the shape of the pooled output)
    res_r = int(numpy.floor(last_pool_r/rs)) + 1
    res_c = int(numpy.floor(last_pool_c/cs)) + 1

    for bc01v in get_debug_values(bc01):
        assert not numpy.any(numpy.isinf(bc01v))
        assert bc01v.shape[2] == image_shape[0]
        assert bc01v.shape[3] == image_shape[1]

    # padding: if the strides do not tile the image exactly,
    # the original image has to be extended with zeros
    padded = tensor.alloc(0.0, batch, channel, required_r, required_c)
    name = bc01.name
    if name is None:
        name = 'anon_bc01'
    bc01 = tensor.set_subtensor(padded[:,:, 0:r, 0:c], bc01)
    bc01.name = 'zero_padded_' + name

    # unraveling
    window = tensor.alloc(0.0, batch, channel, res_r, res_c, pr, pc)
    window.name = 'unravlled_winodows_' + name

    for row_within_pool in xrange(pool_shape[0]):
        row_stop = last_pool_r + row_within_pool + 1
        for col_within_pool in xrange(pool_shape[1]):
            col_stop = last_pool_c + col_within_pool + 1
            win_cell = bc01[:,:,row_within_pool:row_stop:rs, col_within_pool:col_stop:cs]
            # after the loops, window holds every pooling region
            window = tensor.set_subtensor(window[:,:,:,:, row_within_pool, col_within_pool], win_cell)

    # find the norm
    norm = window.sum(axis = [4, 5])  # sum of each region, used as the denominator
    norm = tensor.switch(tensor.eq(norm, 0.0), 1.0, norm)  # if norm is 0, set it to 1
    norm = window / norm.dimshuffle(0, 1, 2, 3, 'x', 'x')  # divide by norm to get the probability of each position
    # get prob
    # multinomial() draws one sample per row of pvals; each entry of the result is 0 or 1
    prob = rng.multinomial(pvals = norm.reshape((batch * channel * res_r * res_c, pr * pc)), dtype='float32')
    # select
    # window * prob is an elementwise product (numpy-style), not a matrix product
    res = (window * prob.reshape((batch, channel, res_r, res_c, pr, pc))).max(axis=5).max(axis=4)
    res.name = 'pooled_' + name

    return tensor.cast(res, theano.config.floatX)


def weighted_max_pool_bc01(bc01, pool_shape, pool_stride, image_shape, rng = None):
    """
    This implements test time probability weighted pooling defined in:

    Stochastic Pooling for Regularization of Deep Convolutional Neural Networks
    Matthew D. Zeiler, Rob Fergus

    bc01: minibatch in format (batch size, channels, rows, cols),
        IMPORTANT: All values should be positive
    pool_shape: shape of the pool region (rows, cols)
    pool_stride: strides between pooling regions (row stride, col stride)
    image_shape: avoid doing some of the arithmetic in theano
    """
    r, c = image_shape
    pr, pc = pool_shape
    rs, cs = pool_stride

    batch = bc01.shape[0]
    channel = bc01.shape[1]

    if rng is None:
        rng = RandomStreams(2022)

    # Compute index in pooled space of last needed pool
    # (needed = each input pixel must appear in at least one pool)
    def last_pool(im_shp, p_shp, p_strd):
        rval = int(numpy.ceil(float(im_shp - p_shp) / p_strd))
        assert p_strd * rval + p_shp >= im_shp
        assert p_strd * (rval - 1) + p_shp < im_shp
        return rval

    # Compute starting row of the last pool
    last_pool_r = last_pool(image_shape[0] ,pool_shape[0], pool_stride[0]) * pool_stride[0]
    # Compute number of rows needed in image for all indexes to work out
    required_r = last_pool_r + pr

    last_pool_c = last_pool(image_shape[1] ,pool_shape[1], pool_stride[1]) * pool_stride[1]
    required_c = last_pool_c + pc

    # final result shape
    res_r = int(numpy.floor(last_pool_r/rs)) + 1
    res_c = int(numpy.floor(last_pool_c/cs)) + 1

    for bc01v in get_debug_values(bc01):
        assert not numpy.any(numpy.isinf(bc01v))
        assert bc01v.shape[2] == image_shape[0]
        assert bc01v.shape[3] == image_shape[1]

    # padding
    padded = tensor.alloc(0.0, batch, channel, required_r, required_c)
    name = bc01.name
    if name is None:
        name = 'anon_bc01'
    bc01 = tensor.set_subtensor(padded[:,:, 0:r, 0:c], bc01)
    bc01.name = 'zero_padded_' + name

    # unraveling
    window = tensor.alloc(0.0, batch, channel, res_r, res_c, pr, pc)
    window.name = 'unravlled_winodows_' + name

    for row_within_pool in xrange(pool_shape[0]):
        row_stop = last_pool_r + row_within_pool + 1
        for col_within_pool in xrange(pool_shape[1]):
            col_stop = last_pool_c + col_within_pool + 1
            win_cell = bc01[:,:,row_within_pool:row_stop:rs, col_within_pool:col_stop:cs]
            window = tensor.set_subtensor(window[:,:,:,:, row_within_pool, col_within_pool], win_cell)

    # find the norm
    norm = window.sum(axis = [4, 5])
    norm = tensor.switch(tensor.eq(norm, 0.0), 1.0, norm)
    norm = window / norm.dimshuffle(0, 1, 2, 3, 'x', 'x')
    # average
    # everything above is almost identical to the training-time forward pass;
    # here we only need the weighted sum
    res = (window * norm).sum(axis=[4,5])
    res.name = 'pooled_' + name

    return res.reshape((batch, channel, res_r, res_c))
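The shape arithmetic in both functions is easy to get lost in, so here is a small pure-Python sketch of it along one axis. The helper name `pooled_size` is made up for illustration. It mirrors the `last_pool`, `required_r` and `res_r` computations in the listing, but is not part of pylearn2.

```python
import math

def pooled_size(im_size, pool_size, stride):
    """Return (number of pooling windows, zero-padded length) along one axis."""
    # Index of the last window needed so that every input pixel is covered.
    last = int(math.ceil(float(im_size - pool_size) / stride))
    last_start = last * stride          # starting position of the last window
    required = last_start + pool_size   # length the padded image must have
    return last + 1, required

print(pooled_size(5, 3, 2))  # (2, 5): windows start at 0 and 2, no padding
print(pooled_size(6, 3, 2))  # (3, 7): windows start at 0, 2, 4; one padded row
```

When the required length is larger than the image size, the difference is the number of zero rows (or columns) that the padding step adds.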
References:
Stochastic Pooling for Regularization of Deep Convolutional Neural Networks. Matthew D. Zeiler, Rob Fergus. ICLR 2013.