Theano scan optimization
Selected from the Theano documentation
Optimizing Scan performance
Minimizing Scan Usage
Perform as much of the computation as possible outside of Scan. This may increase memory usage, but it also reduces the overhead introduced by Scan.
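As a rough illustration of this trade-off, the sketch below uses plain NumPy with a Python loop standing in for Scan; it is not Theano code. The recurrence, the sizes, and the names `x_seq`, `W_in` and `W_rec` are assumptions made up for the example. The input projection does not depend on the previous state, so it can be computed for all time steps in one product before the loop, at the cost of keeping the projected sequence in memory.

```python
import numpy as np

rng = np.random.RandomState(0)
n_steps, n_in, n_hid = 5, 3, 4          # hypothetical sizes
x_seq = rng.randn(n_steps, n_in)        # one input row per time step
W_in = rng.randn(n_in, n_hid)           # hypothetical input weights
W_rec = rng.randn(n_hid, n_hid) * 0.1   # hypothetical recurrent weights


def run_all_inside(x_seq):
    """Each step recomputes the input projection inside the loop."""
    h = np.zeros(n_hid)
    for x_t in x_seq:
        h = np.tanh(np.dot(x_t, W_in) + np.dot(h, W_rec))
    return h


def run_hoisted(x_seq):
    """The input projection is done once, for all steps, before the loop."""
    proj = np.dot(x_seq, W_in)          # shape (n_steps, n_hid), kept in memory
    h = np.zeros(n_hid)
    for p_t in proj:
        h = np.tanh(p_t + np.dot(h, W_rec))
    return h


h_inside = run_all_inside(x_seq)
h_hoisted = run_hoisted(x_seq)
```

Both functions should return the same final state. In a Theano graph, the analogous change would be to compute the projection before calling `theano.scan` and pass the result in as a sequence.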
Explicitly passing inputs of the inner function to scan
It is more efficient to pass parameters explicitly as non-sequence inputs than to let the inner function pick up shared variables implicitly.
Examples: Gibbs Sampling
Version One:
import theano
from theano import tensor as T

# We assume that ``W_values`` contains the initial values of your weight matrix.
W = theano.shared(W_values)
bvis = theano.shared(bvis_values)
bhid = theano.shared(bhid_values)
trng = T.shared_randomstreams.RandomStreams(1234)

# OneStep uses the shared variables (W, bvis, bhid) implicitly.
def OneStep(vsample):
    hmean = T.nnet.sigmoid(theano.dot(vsample, W) + bhid)
    hsample = trng.binomial(size=hmean.shape, n=1, p=hmean)
    vmean = T.nnet.sigmoid(theano.dot(hsample, W.T) + bvis)
    return trng.binomial(size=vsample.shape, n=1, p=vmean,
                         dtype=theano.config.floatX)

sample = theano.tensor.vector()
values, updates = theano.scan(OneStep, outputs_info=sample, n_steps=10)
gibbs10 = theano.function([sample], values[-1], updates=updates)
Version Two:
# We assume that ``W_values`` contains the initial values of your weight matrix.
W = theano.shared(W_values)
bvis = theano.shared(bvis_values)
bhid = theano.shared(bhid_values)
trng = T.shared_randomstreams.RandomStreams(1234)

# OneStep, with explicit use of the shared variables (W, bvis, bhid)
def OneStep(vsample, W, bvis, bhid):
    hmean = T.nnet.sigmoid(theano.dot(vsample, W) + bhid)
    hsample = trng.binomial(size=hmean.shape, n=1, p=hmean)
    vmean = T.nnet.sigmoid(theano.dot(hsample, W.T) + bvis)
    return trng.binomial(size=vsample.shape, n=1, p=vmean,
                         dtype=theano.config.floatX)

sample = theano.tensor.vector()

# The new scan, with the shared variables passed as non_sequences
values, updates = theano.scan(fn=OneStep,
                              outputs_info=sample,
                              non_sequences=[W, bvis, bhid],
                              n_steps=10)
gibbs10 = theano.function([sample], values[-1], updates=updates)
Deactivating garbage collection in Scan
Deactivating garbage collection in Scan allows it to reuse memory between executions instead of always having to allocate new memory. Otherwise, Scan reuses memory between iterations of the same execution but frees the memory after the last iteration.
config.scan.allow_gc=False
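A minimal sketch of two ways this setting might be applied, assuming a working Theano installation: the flag can be changed for the whole process, or `allow_gc=False` can be passed to a single `theano.scan` call. The running-sum step function and the names `x`, `cumsum` and `running_sum` are made up for illustration.

```python
import theano
import theano.tensor as T

# Option 1: change the flag for every Scan built later in this process.
theano.config.scan.allow_gc = False

# Option 2: override the flag for one scan call only.
x = T.vector("x")
cumsum, updates = theano.scan(
    fn=lambda x_t, acc: acc + x_t,      # hypothetical step: running sum
    sequences=x,
    outputs_info=T.zeros_like(x[0]),
    allow_gc=False,
)
running_sum = theano.function([x], cumsum, updates=updates)
```

Either way, the memory that Scan holds on to between calls stays allocated, so this is mostly worth trying when the same compiled function is called many times.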
Graph Optimizations
There are patterns that Theano cannot optimize automatically. The LSTM tutorial provides an example of an optimization that Theano can't perform. Instead of performing many matrix multiplications between the matrix \(x_t\) and each of the shared matrices \(W_i, W_c, W_f\) and \(W_o\), the matrices \(W_{*}\) are merged into a single shared matrix \(W\), and the graph performs a single larger matrix multiplication between \(W\) and \(x_t\). The resulting matrix is then sliced to obtain the results that the small individual matrix multiplications would have produced. This replaces several small matrix multiplications by a single larger one and thus improves performance at the cost of potentially higher memory usage.
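The merge-then-slice idea can be checked numerically with plain NumPy. This is only a sketch, not the tutorial's code: the sizes, the column-wise concatenation order, and the helper `gate_slice` are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.RandomState(0)
batch, n_in, n_hid = 2, 3, 4                      # hypothetical sizes
x_t = rng.randn(batch, n_in)
W_i, W_c, W_f, W_o = [rng.randn(n_in, n_hid) for _ in range(4)]

# Four small products, one per weight matrix.
separate = [np.dot(x_t, W_k) for W_k in (W_i, W_c, W_f, W_o)]

# One merged weight matrix and a single larger product.
W = np.concatenate([W_i, W_c, W_f, W_o], axis=1)  # shape (n_in, 4 * n_hid)
z = np.dot(x_t, W)                                # shape (batch, 4 * n_hid)


def gate_slice(z, k, n):
    """Return the k-th block of n columns of z."""
    return z[:, k * n:(k + 1) * n]


merged = [gate_slice(z, k, n_hid) for k in range(4)]
```

Each entry of `merged` should match the corresponding entry of `separate`. The larger `W` and `z` are where the extra memory usage comes from.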