Optimizing `Scan` performance

Minimizing Scan Usage

performan as much of the computation as possible outside of Scan. This may have the effect increasing memory usage but also reduce the overhead introduce by Scan.

Explicitly passing inputs of the inner function to scan

It's more efficient to explicitly pass parameter as non-sequence inputs.

Examples: Gibbs Sampling

Version One:

import theano

from theano import tensor as T

W = theano.shared(W_values) # we assume that ``W_values`` contains the

                            # initial values of your weight matrix

bvis = theano.shared(bvis_values)

bhid = theano.shared(bhid_values)

trng = T.shared_randomstreams.RandomStreams(1234)

def OneStep(vsample) :

    hmean = T.nnet.sigmoid(theano.dot(vsample, W) + bhid)

    hsample = trng.binomial(size=hmean.shape, n=1, p=hmean)

    vmean = T.nnet.sigmoid(theano.dot(hsample, W.T) + bvis)

    return trng.binomial(size=vsample.shape, n=1, p=vmean,

                         dtype=theano.config.floatX)

sample = theano.tensor.vector()

values, updates = theano.scan(OneStep, outputs_info=sample, n_steps=10)

gibbs10 = theano.function([sample], values[-1], updates=updates)

Version Two:

W = theano.shared(W_values) # we assume that ``W_values`` contains the

                            # initial values of your weight matrix

bvis = theano.shared(bvis_values)

bhid = theano.shared(bhid_values)

trng = T.shared_randomstreams.RandomStreams(1234)

# OneStep, with explicit use of the shared variables (W, bvis, bhid)

def OneStep(vsample, W, bvis, bhid):

    hmean = T.nnet.sigmoid(theano.dot(vsample, W) + bhid)

    hsample = trng.binomial(size=hmean.shape, n=1, p=hmean)

    vmean = T.nnet.sigmoid(theano.dot(hsample, W.T) + bvis)

    return trng.binomial(size=vsample.shape, n=1, p=vmean,

                     dtype=theano.config.floatX)

sample = theano.tensor.vector()

# The new scan, with the shared variables passed as non_sequences

values, updates = theano.scan(fn=OneStep,

                              outputs_info=sample,

                              non_sequences=[W, bvis, bhid],

                              n_steps=10)

gibbs10 = theano.function([sample], values[-1], updates=updates)

Deactivating garbage collecting in Scan

Deactivating garbage collecting in Scan can allow it to reuse memory between executins instead of always having to allocate new memory. Scan reuses memory between iterations of the same execution but frees the memory after the last iteration.

config.scan.allow_gc=False

Graph Optimizations

There are patterns that Theano can't optimize. the LSTM tutorial provides an example of optimization that theano can't perform. Instead of performing many matrix multiplications between matrix \(x_t\) and each of the shared msatrices \(W_i,W_c,W_f\) and \(W_o\), the matrixes \(W_{*}\) are merged into a single shared \(W\) and the graph performans a single larger matrix multiplication between \(W\) and \(x_t\). The resulting matrix is then sliced to obtain the results of that the small individial matrix multiplications by a single larger one and thus improves performance at the cost of a potentially higher memory usage.

theano scan optimization的更多相关文章

theano中的scan用法
scan函数是theano中的循环函数,相当于for loop.在读别人的代码时第一次看到,有点迷糊,不知道输入.输出怎么定义,网上也很少有example,大多数都是相互转载同一篇.所以,还是要看官方 ...
Theano学习-scan循环
\(1.Scan\) 通用的一般形式,可用于循环减少和映射(对维数循环)是特殊的 \(scan\) 对输入序列进行 \(scan\) 操作,每一步都能得到一个输出 \(scan\) 能看到定义函数的 ...
theano学习
import numpy import theano.tensor as T from theano import function x = T.dscalar('x') y = T.dscalar( ...
LSTM 分类器笔记及Theano实现
相关讨论 http://tieba.baidu.com/p/3960350008 基于教程http://deeplearning.net/tutorial/lstm.html LSTM基本原理http ...
关于thenao.scan() fn函数参数的说明
theano.scan()原型: theano.scan( fn, sequences=None, outputs_info=None, non_sequences=None, n_steps=Non ...
Theano学习-梯度计算
1. 计算梯度创建一个函数 \(y\) ,并且计算关于其参数 \(x\) 的微分. 为了实现这一功能,将使用函数 \(T.grad\) . 例如:计算 \(x^2\) 关于参数 \(x\) 的梯度. ...
IMPLEMENTING A GRU/LSTM RNN WITH PYTHON AND THEANO - 学习笔记
catalogue . 引言 . LSTM NETWORKS . LSTM 的变体 . GRUs (Gated Recurrent Units) . IMPLEMENTATION GRUs 0. 引言 ...
theano安装问题
WARNING (theano.configdefaults): g++ not available, if using conda: `conda install m2w64-toolchain` ...
theano使用
一 theano内置数据类型只有thenao.shared()类型才有get_value()成员函数(返回numpy.ndarray)? 1. 惯常处理 x = T.matrix('x') # t ...

随机推荐

Bootstrap 之 Carousel
Bootstrap 轮播(Carousel)插件是一种灵活的响应式的向站点添加滑块的方式.除此之外,内容也是足够灵活的,可以是图像.内嵌框架.视频或者其他您想要放置的任何类型的内容. 如果您想要单独引 ...
如何在SharePoint 当中使用纯JSOM上传任意二进制文件（小于2MＢ）
在微软的官方网站上有关于如何在SharePoint当中使用JS创建一个简单的文本文件的例子,经过我的思考我觉得结合Html5特性的浏览器,是完全可以通过JS来读取到文件的内容的(这一部分的内容请大家自 ...
Atitit 自然语言处理原理与实现 attilax总结
Atitit 自然语言处理原理与实现 attilax总结 1.1. 中文分词原理与实现 111 1.2. 英文分析 1941 1.3. 第6章信息提取 2711 1.4. 第7章自动摘要 3041 ...
iOS 性能调试
性能调优的方式: 1.通过专门的性能调优工具 2.通过代码优化 1. 性能调优工具: 下面针对iOS的性能调优工具进行一个介绍: 1.1 静态分析工具–Analyze 相信iOS开发者在App进行Bu ...
An App ID with Identifier 'com.XXX.XXX’ is not available. Please enter a different string.报错
昨天刚刚升的Xcode7.3和iOS9.3,然后没怎么使用这两样就下班了,但是今天早上来了之后,就发现突然之间不能真机测试和运行代码了,一看才发现xcode报错: An App ID with Ide ...
plist文件、NSUserDefault 对文件进行存储的类、json格式解析
========================== 文件操作 ========================== Δ一 .plist文件 .plist文件是一个属性字典数组的一个文件: .plis ...
ASP.NET MVC 3 网站优化总结(三)Specify Vary: Accept-Encoding header
继续进行 ASP.NET MVC 3 网站优化工作,使用 Google Page 检测发现提示 You should Specify Vary: Accept-Encoding header,The ...
spark dataframe 类型转换
读一张表,对其进行二值化特征转换.可以二值化要求输入类型必须double类型,类型怎么转换呢? 直接利用spark column 就可以进行转换: DataFrame dataset = hive.s ...
[Top-Down Approach] Chatper 3 Notes
这里留下空白,提醒自己,第一章第二章尚待整理回顾. 此处缺了3.6/3.7两节拥塞控制的内容
grep 命令过滤配置文件中的注释和空行
grep 用法 Usage: grep [OPTION]... PATTERN [FILE]... Search for PATTERN in each FILE or standard input. ...

theano scan optimization

Optimizing Scan performance