Reducing the Dimensionality of Data with Neural Networks / A Fast Learning Algorithm for Deep Belief Nets
Annotated code from Hinton, the author of the original deep learning papers
The Matlab example code has two parts, each corresponding to a different paper:
1. Reducing the Dimensionality of Data with Neural Networks: mnistdeepauto.m backprop.m rbmhidlinear.m
2. A Fast Learning Algorithm for Deep Belief Nets: mnistclassify.m backpropclassify.m
The remaining code is shared by both.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
mnistclassify.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
clear all
close all

maxepoch=50; % number of training epochs
numhid=500; numpen=500; numpen2=2000;

fprintf(1,'Converting Raw files into Matlab format \n');
converter;

fprintf(1,'Pretraining a deep autoencoder. \n');
fprintf(1,'The Science paper used 50 epochs. This uses %3i \n', maxepoch);

makebatches; % split the data into mini-batches
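[numcases numdims numbatches]=size(batchdata); % get the dimensions of batchdata
%% numcases: number of cases in each batch
%% numdims: dimensionality of each data vector
%% numbatches: number of batches

fprintf(1,'Pretraining Layer 1 with RBM: %d-%d \n',numdims,numhid); % image input layer to the first hidden layer
restart=1; % tell rbm.m to initialize its parameters
rbm; % train this layer by running the RBM script
hidrecbiases=hidbiases; % keep the hidden-layer biases
save mnistvhclassify vishid hidrecbiases visbiases;

fprintf(1,'\nPretraining Layer 2 with RBM: %d-%d \n',numhid,numpen); % first hidden layer to the second hidden layer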
batchdata=batchposhidprobs; % the hidden-layer output of the previous RBM becomes the input of this RBM
numhid=numpen; % set the number of hidden units; the number of inputs is given by the data just assigned
restart=1;
rbm;
hidpen=vishid; penrecbiases=hidbiases; hidgenbiases=visbiases; % as above: extract the weights and biases
save mnisthpclassify hidpen penrecbiases hidgenbiases;

fprintf(1,'\nPretraining Layer 3 with RBM: %d-%d \n',numpen,numpen2); % second hidden layer to the third hidden layer; otherwise as above
batchdata=batchposhidprobs;
numhid=numpen2;
restart=1;
rbm;
hidpen2=vishid; penrecbiases2=hidbiases; hidgenbiases2=visbiases;
save mnisthp2classify hidpen2 penrecbiases2 hidgenbiases2;

backpropclassify;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
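The script above trains three RBMs in sequence, and each one takes the hidden probabilities of the previous one (batchposhidprobs) as its data. The data flow can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `train_rbm_stub` is a hypothetical stand-in that returns random weights instead of running rbm.m, and the tiny layer sizes replace the real 500-500-2000 stack so that the sketch runs instantly.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hidden_probs(data, vishid, hidbiases):
    """Row-wise sigmoid(data * vishid + hidbiases), like poshidprobs in rbm.m."""
    numhid = len(hidbiases)
    return [
        [sigmoid(sum(v * vishid[i][j] for i, v in enumerate(row)) + hidbiases[j])
         for j in range(numhid)]
        for row in data
    ]

def train_rbm_stub(data, numhid, rng):
    """Hypothetical stand-in for rbm.m: random weights, no learning at all."""
    numdims = len(data[0])
    vishid = [[0.1 * rng.gauss(0, 1) for _ in range(numhid)] for _ in range(numdims)]
    hidbiases = [0.0] * numhid
    return vishid, hidbiases, hidden_probs(data, vishid, hidbiases)

def pretrain_stack(data, layer_sizes, seed=0):
    """Greedy stacking: the hidden probabilities of layer k feed layer k+1."""
    rng = random.Random(seed)
    weights = []
    for numhid in layer_sizes:
        vishid, hidbiases, data = train_rbm_stub(data, numhid, rng)
        weights.append((vishid, hidbiases))
    return weights, data

batch = [[0.0, 1.0, 1.0, 0.0], [1.0, 0.0, 0.0, 1.0]]   # 2 cases, 4 "pixels"
weights, top = pretrain_stack(batch, [3, 3, 5])          # toy stand-in for 500-500-2000
shapes = [(len(w), len(w[0])) for w, _ in weights]
print(shapes)                                            # [(4, 3), (3, 3), (3, 5)]
```

Note how the input size of each layer is never set by hand; it follows from the width of the data handed over by the layer below, which is what the numhid=numpen comment in the script points out.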
backpropclassify.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
maxepoch=200;
fprintf(1,'\nTraining discriminative model on MNIST by minimizing cross entropy error. \n'); % minimize the cross-entropy
fprintf(1,'60 batches of 1000 cases each. \n');

load mnistvhclassify % load the weights and biases between the layers
load mnisthpclassify
load mnisthp2classify

makebatches; % split the data into mini-batches
[numcases numdims numbatches]=size(batchdata);
N=numcases; % number of data vectors in each batch

%%%% PREINITIALIZE WEIGHTS OF THE DISCRIMINATIVE MODEL%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
w1=[vishid; hidrecbiases]; % weights from layer 1 to layer 2, with the layer-2 biases appended as the last row
w2=[hidpen; penrecbiases]; % as above
w3=[hidpen2; penrecbiases2]; % as above
w_class = 0.1*randn(size(w3,2)+1,10); % random matrix with (number of columns of w3)+1 rows and 10 columns
%%%%%%%%%% END OF PREINITIALIZATION OF WEIGHTS %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

l1=size(w1,1)-1; % number of units in each layer
l2=size(w2,1)-1;
l3=size(w3,1)-1;
l4=size(w_class,1)-1; % number of units in the top hidden layer
l5=10; % number of units in the label layer
test_err=[];
train_err=[];

for epoch = 1:maxepoch

%%%%%%%%%%%%%%%%%%%% COMPUTE TRAINING MISCLASSIFICATION ERROR %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
err=0;
err_cr=0;
counter=0;
[numcases numdims numbatches]=size(batchdata);
%% numcases: number of cases in each batch
%% numdims: dimensionality of each data vector
%% numbatches: number of batches
N=numcases; %% number of data vectors in each batch
 for batch = 1:numbatches
  data = [batchdata(:,:,batch)]; % read one batch of data
  target = [batchtargets(:,:,batch)]; % read the targets of the current batch
  data = [data ones(N,1)]; % append a column of N ones to the data (the bias input)
  w1probs = 1./(1 + exp(-data*w1)); w1probs = [w1probs ones(N,1)]; % sigmoid activation probabilities of each layer, as in the BP algorithm
  w2probs = 1./(1 + exp(-w1probs*w2)); w2probs = [w2probs ones(N,1)];
  w3probs = 1./(1 + exp(-w2probs*w3)); w3probs = [w3probs ones(N,1)];
  targetout = exp(w3probs*w_class); % unnormalized outputs, N rows by 10 columns
  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  % Processing of the final label outputs, see formula 6.; w3probs*w_class is the input to the label units
  % Only one label unit can end up active; it is chosen using the probabilities computed below
  % The 10 units form a "softmax" group
  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  targetout = targetout./repmat(sum(targetout,2),1,10); % divide each of the 10 label outputs by the sum of its row
  [I J]=max(targetout,[],2); % maximum of each row of the outputs and its column index
  [I1 J1]=max(target,[],2); % maximum of each row of the targets and its column index
  counter=counter+length(find(J==J1)); % count the correctly classified cases
  err_cr = err_cr- sum(sum( target(:,1:end).*log(targetout))) ; % accumulate the cross-entropy error
 end
 train_err(epoch)=(numcases*numbatches-counter); % total number of misclassified training cases
 train_crerr(epoch)=err_cr/numbatches; % average cross-entropy error per batch
%%%%%%%%%%%%%% END OF COMPUTING TRAINING MISCLASSIFICATION ERROR %%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%%%%% COMPUTE TEST MISCLASSIFICATION ERROR %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
err=0;
err_cr=0;
counter=0;
[testnumcases testnumdims testnumbatches]=size(testbatchdata);
N=testnumcases;
 for batch = 1:testnumbatches
  data = [testbatchdata(:,:,batch)];
  target = [testbatchtargets(:,:,batch)];
  data = [data ones(N,1)];
  w1probs = 1./(1 + exp(-data*w1)); w1probs = [w1probs ones(N,1)];
  w2probs = 1./(1 + exp(-w1probs*w2)); w2probs = [w2probs ones(N,1)];
  w3probs = 1./(1 + exp(-w2probs*w3)); w3probs = [w3probs ones(N,1)];
  targetout = exp(w3probs*w_class);
  targetout = targetout./repmat(sum(targetout,2),1,10);
  [I J]=max(targetout,[],2);
  [I1 J1]=max(target,[],2);
  counter=counter+length(find(J==J1));
  err_cr = err_cr- sum(sum( target(:,1:end).*log(targetout))) ;
 end
 test_err(epoch)=(testnumcases*testnumbatches-counter);
 test_crerr(epoch)=err_cr/testnumbatches;
 fprintf(1,'Before epoch %d Train # misclassified: %d (from %d). Test # misclassified: %d (from %d) \t \t \n',...
         epoch,train_err(epoch),numcases*numbatches,test_err(epoch),testnumcases*testnumbatches);
%%%%%%%%%%%%%% END OF COMPUTING TEST MISCLASSIFICATION ERROR %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

 tt=0;
 for batch = 1:numbatches/10
 fprintf(1,'epoch %d batch %d\r',epoch,batch);

%%%%%%%%%%% COMBINE MINIBATCHES INTO LARGER MINIBATCH %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% combine 10 mini-batches into one batch of 1000 cases, then fine-tune with conjugate gradient
 tt=tt+1;
 data=[];
 targets=[];
 for kk=1:10
  data=[data
        batchdata(:,:,(tt-1)*10+kk)]; % stack the 10 mini-batches
  targets=[targets
        batchtargets(:,:,(tt-1)*10+kk)];
 end

%%%%%%%%%%%%%%% PERFORM CONJUGATE GRADIENT WITH LINESEARCHES %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
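  max_iter=3; % number of line searches

  if epoch<6 % First update top-level weights holding other weights fixed.
    N = size(data,1); % number of rows of data
    XX = [data ones(N,1)]; % append a 1 to each row to carry the bias
    w1probs = 1./(1 + exp(-XX*w1)); w1probs = [w1probs ones(N,1)];
    w2probs = 1./(1 + exp(-w1probs*w2)); w2probs = [w2probs ones(N,1)];
    w3probs = 1./(1 + exp(-w2probs*w3)); %w3probs = [w3probs ones(N,1)];

    VV = [w_class(:)']'; % unroll w_class into a single column, because minimize expects its parameters as a D-by-1 vector
    Dim = [l4; l5]; % sizes of the last two layers: the 2000-unit hidden layer and the 10-unit label layer
    [X, fX] = minimize(VV,'CG_CLASSIFY_INIT',max_iter,Dim,w3probs,targets); % trains only the top-level weights; see the function definition for details
    % minimize is Carl Rasmussen's "minimize" code
    %%------------------ meaning of the parameters ------------------%%
    % VV        the unrolled weights, used as the starting point; must be a single column (D by 1)
    % X         the parameters found by optimizing f="CG_CLASSIFY_INIT"
    % fX        vector of function values of f recording the progress made
    % max_iter  if positive, the maximum number of line searches; if negative, its absolute value is the maximum number of function evaluations
    %%-------------------------------------------------%
    w_class = reshape(X,l4+1,l5); % restore the shape of the weight matrix

  else % enter the global fine-tuning phase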
    VV = [w1(:)' w2(:)' w3(:)' w_class(:)']'; % unroll all weights, column by column, into one column vector
    Dim = [l1; l2; l3; l4; l5]; % pass in the number of units in each layer
    [X, fX] = minimize(VV,'CG_CLASSIFY',max_iter,Dim,data,targets);

    w1 = reshape(X(1:(l1+1)*l2),l1+1,l2); % restore w1
    xxx = (l1+1)*l2; % running offset used to restore the remaining weight matrices
    w2 = reshape(X(xxx+1:xxx+(l2+1)*l3),l2+1,l3);
    xxx = xxx+(l2+1)*l3;
    w3 = reshape(X(xxx+1:xxx+(l3+1)*l4),l3+1,l4);
    xxx = xxx+(l3+1)*l4;
    w_class = reshape(X(xxx+1:xxx+(l4+1)*l5),l4+1,l5);

  end
%%%%%%%%%%%%%%% END OF CONJUGATE GRADIENT WITH LINESEARCHES %%%%%%%%%%%%%%%%%%%%%%%%%%%%%

 end

 save mnistclassify_weights w1 w2 w3 w_class
 save mnistclassify_error test_err test_crerr train_err train_crerr;

end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
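The fine-tuning branch above unrolls every weight matrix into the single column VV for minimize and then cuts X back into w1, w2, w3 and w_class with a running offset (xxx). Matlab's `(:)` and `reshape` work column by column, which is easy to get wrong when porting. The following Python sketch mirrors that packing on toy sizes; the helper names (`flatten_colmajor`, `reshape_colmajor`, `pack`, `unpack`) are made up for this illustration and do not exist in the original code.

```python
def flatten_colmajor(matrix):
    """Like Matlab's w(:): stack the columns of a list-of-rows matrix into one list."""
    rows, cols = len(matrix), len(matrix[0])
    return [matrix[r][c] for c in range(cols) for r in range(rows)]

def reshape_colmajor(vec, rows, cols):
    """Like Matlab's reshape(vec, rows, cols): fill the result column by column."""
    assert len(vec) == rows * cols
    return [[vec[c * rows + r] for c in range(cols)] for r in range(rows)]

def pack(weights):
    """VV = [w1(:)' w2(:)' ...]': concatenate all unrolled matrices."""
    vv = []
    for w in weights:
        vv.extend(flatten_colmajor(w))
    return vv

def unpack(vv, dims):
    """Inverse of pack. dims = [l1, l2, ...]; matrix i has shape (dims[i]+1, dims[i+1])."""
    weights, offset = [], 0
    for n_in, n_out in zip(dims[:-1], dims[1:]):
        size = (n_in + 1) * n_out            # +1 row for the biases
        weights.append(reshape_colmajor(vv[offset:offset + size], n_in + 1, n_out))
        offset += size                        # plays the role of xxx
    assert offset == len(vv)
    return weights

dims = [2, 3, 1]                              # toy sizes instead of the real layer sizes
w1 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]        # (2+1) x 3
w2 = [[10], [11], [12], [13]]                 # (3+1) x 1
vv = pack([w1, w2])
print(vv[:3])                                 # first column of w1: [1, 4, 7]
print(unpack(vv, dims) == [w1, w2])           # True
```

The round trip only works because packing and unpacking agree on both the order of the matrices and the column-major layout, which is why Dim has to be passed along with VV.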
rbm.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
epsilonw = 0.1; % Learning rate for weights
epsilonvb = 0.1; % Learning rate for biases of visible units
epsilonhb = 0.1; % Learning rate for biases of hidden units
weightcost = 0.0002;
initialmomentum = 0.5;
finalmomentum = 0.9;

[numcases numdims numbatches]=size(batchdata);
%% numcases: number of cases in each batch
%% numdims: dimensionality of each data vector
%% numbatches: number of batches

if restart ==1,
  restart=0;
  epoch=1;

% Initializing symmetric weights and biases.
  vishid = 0.1*randn(numdims, numhid); % initial visible-to-hidden weights
  hidbiases = zeros(1,numhid); % biases of the hidden units
  visbiases = zeros(1,numdims); % biases of the visible units

  poshidprobs = zeros(numcases,numhid); % hidden probabilities in the positive phase
  neghidprobs = zeros(numcases,numhid); % hidden probabilities in the negative phase
  posprods = zeros(numdims,numhid); % visible-hidden products in the positive phase
  negprods = zeros(numdims,numhid); % visible-hidden products in the negative phase
  vishidinc = zeros(numdims,numhid); % increment of the visible-to-hidden weights
  hidbiasinc = zeros(1,numhid); % increment of the hidden biases
  visbiasinc = zeros(1,numdims); % increment of the visible biases
  batchposhidprobs=zeros(numcases,numhid,numbatches); % stores the hidden probabilities of every batch, used as the input to the next RBM
end

%%%%%%%%%%%%%%%% print the epoch and the batch being processed %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
for epoch = epoch:maxepoch, % loop over the epochs
 fprintf(1,'epoch %d\r',epoch);
 errsum=0; % reset the accumulated reconstruction error to 0
 for batch = 1:numbatches, % process one batch at a time
 fprintf(1,'epoch %d batch %d\r',epoch,batch);

%%%%%%%%% START POSITIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  data = batchdata(:,:,batch); % read all data of the current batch, vi
  poshidprobs = 1./(1 + exp(-data*vishid - repmat(hidbiases,numcases,1))); % hidden probabilities hi given the data
  batchposhidprobs(:,:,batch)=poshidprobs; % store them; the values from the last epoch become the input to the next layer
  posprods = data' * poshidprobs; % contrastive divergence statistic <vi,hi>
  poshidact = sum(poshidprobs); % hidden activation probabilities summed over the batch
  posvisact = sum(data); % visible activations summed over the batch
%%%%%%%%% END OF POSITIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  poshidstates = poshidprobs > rand(numcases,numhid); % Gibbs sampling: draw binary hidden states

%%%%%%%%% START NEGATIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  negdata = 1./(1 + exp(-poshidstates*vishid' - repmat(visbiases,numcases,1))); % compute vi+1 from hi
  neghidprobs = 1./(1 + exp(-negdata*vishid - repmat(hidbiases,numcases,1))); % compute hi+1 from vi+1
  negprods = negdata'*neghidprobs; % contrastive divergence statistic <vi+1,hi+1>
  neghidact = sum(neghidprobs);
  negvisact = sum(negdata);
%%%%%%%%% END OF NEGATIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  err= sum(sum( (data-negdata).^2 )); % squared reconstruction error of this batch
  errsum = err + errsum; % accumulated over the whole epoch

   if epoch>5, % switch the momentum after the first few epochs
     momentum=finalmomentum;
   else
     momentum=initialmomentum;
   end;

%%%%%%%%% UPDATE WEIGHTS AND BIASES %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    vishidinc = momentum*vishidinc + ...
                epsilonw*( (posprods-negprods)/numcases - weightcost*vishid); % weight increment
    visbiasinc = momentum*visbiasinc + (epsilonvb/numcases)*(posvisact-negvisact); % visible bias increment
    hidbiasinc = momentum*hidbiasinc + (epsilonhb/numcases)*(poshidact-neghidact); % hidden bias increment

    vishid = vishid + vishidinc;
    visbiases = visbiases + visbiasinc;
    hidbiases = hidbiases + hidbiasinc;
%%%%%%%%%%%%%%%% END OF UPDATES %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

  end
  fprintf(1, 'epoch %4i error %6.1f  \n', epoch, errsum);
end;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
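The positive phase, the sampling step, the negative phase and the update in rbm.m together form one step of contrastive divergence (CD-1). Below is a minimal pure-Python sketch of that step on a toy batch. It is a simplification, not a port: momentum and weight decay are left out, a single learning rate `epsilon` replaces the three rates, and the function name `cd1_step` and the toy data are made up for this example.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def cd1_step(batch, vishid, visbiases, hidbiases, rng, epsilon=0.1):
    """One CD-1 update on one batch; changes the parameters in place and
    returns the summed squared reconstruction error (err in rbm.m)."""
    numcases, numdims, numhid = len(batch), len(visbiases), len(hidbiases)

    def up(v):      # p(h = 1 | v)
        return [sigmoid(sum(v[i] * vishid[i][j] for i in range(numdims)) + hidbiases[j])
                for j in range(numhid)]

    def down(h):    # p(v = 1 | h)
        return [sigmoid(sum(h[j] * vishid[i][j] for j in range(numhid)) + visbiases[i])
                for i in range(numdims)]

    dw = [[0.0] * numhid for _ in range(numdims)]    # posprods - negprods
    dvb = [0.0] * numdims                            # posvisact - negvisact
    dhb = [0.0] * numhid                             # poshidact - neghidact
    err = 0.0
    for data in batch:
        poshidprobs = up(data)                                         # positive phase
        poshidstates = [1.0 if p > rng.random() else 0.0 for p in poshidprobs]
        negdata = down(poshidstates)                                   # reconstruction
        neghidprobs = up(negdata)                                      # negative phase
        for i in range(numdims):
            for j in range(numhid):
                dw[i][j] += data[i] * poshidprobs[j] - negdata[i] * neghidprobs[j]
            dvb[i] += data[i] - negdata[i]
            err += (data[i] - negdata[i]) ** 2
        for j in range(numhid):
            dhb[j] += poshidprobs[j] - neghidprobs[j]
    for i in range(numdims):                                           # apply the update
        for j in range(numhid):
            vishid[i][j] += epsilon * dw[i][j] / numcases
        visbiases[i] += epsilon * dvb[i] / numcases
    for j in range(numhid):
        hidbiases[j] += epsilon * dhb[j] / numcases
    return err

rng = random.Random(0)
batch = [[1, 1, 0, 0], [0, 0, 1, 1]] * 5             # 10 toy cases, 4 visible units
numdims, numhid = 4, 2
vishid = [[0.1 * rng.gauss(0, 1) for _ in range(numhid)] for _ in range(numdims)]
visbiases = [0.0] * numdims
hidbiases = [0.0] * numhid
errors = [cd1_step(batch, vishid, visbiases, hidbiases, rng) for _ in range(50)]
print(round(errors[0], 2), round(errors[-1], 2))     # error at the first and last step
```

As in rbm.m, the hidden states are sampled only once per step, and the statistics use the hidden probabilities rather than the sampled states.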
CG_CLASSIFY_INIT.M
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [f, df] = CG_CLASSIFY_INIT(VV,Dim,w3probs,target); % CG training of the top-level weights only
l1 = Dim(1);
l2 = Dim(2);
N = size(w3probs,1);
% Do deconversion.
  w_class = reshape(VV,l1+1,l2); % restore the weight matrix
  w3probs = [w3probs  ones(N,1)]; % append a column of ones for the bias

  targetout = exp(w3probs*w_class); % label-layer outputs, a (number of cases)-by-(number of labels) matrix
  targetout = targetout./repmat(sum(targetout,2),1,10); % softmax normalization, see line 76 of backpropclassify.m
  f = -sum(sum( target(:,1:end).*log(targetout))) ; % cross-entropy; target(:,1:end) selects all label columns
  IO = (targetout-target(:,1:end)); % difference between the outputs and the targets
  Ix_class=IO;
  dw_class =  w3probs'*Ix_class; % gradient of f with respect to w_class: for softmax with cross-entropy it is the inputs times the output error, with no extra sigmoid-derivative factor
  df = [dw_class(:)']';
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
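CG_CLASSIFY_INIT returns the cross-entropy f and its gradient df, where the gradient is simply the bias-augmented inputs times (targetout - target). A quick way to convince oneself of that formula is a finite-difference check. The Python sketch below re-implements the function on toy data for that purpose; the inputs are invented, and subtracting the row maximum before `exp` is a numerical-stability step that is not in the Matlab code.

```python
import math

def cg_classify_init(w_class, w3probs, target):
    """f = cross-entropy of softmax([w3probs 1] * w_class);
    df = [w3probs 1]' * (targetout - target), returned as a matrix."""
    n_in, n_out = len(w_class), len(w_class[0])
    f = 0.0
    df = [[0.0] * n_out for _ in range(n_in)]
    for x, t in zip(w3probs, target):
        x1 = list(x) + [1.0]                              # append the bias input
        z = [sum(x1[i] * w_class[i][k] for i in range(n_in)) for k in range(n_out)]
        m = max(z)                                        # stability shift (not in the original)
        e = [math.exp(v - m) for v in z]
        s = sum(e)
        out = [v / s for v in e]                          # softmax row
        f -= sum(t[k] * math.log(out[k]) for k in range(n_out))
        for i in range(n_in):
            for k in range(n_out):
                df[i][k] += x1[i] * (out[k] - t[k])
    return f, df

w3probs = [[0.2, 0.9], [0.7, 0.1], [0.5, 0.5]]            # 3 cases, 2 hidden units
target = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]                # one-hot labels, 3 classes
w = [[0.1, -0.2, 0.3], [0.0, 0.4, -0.1], [0.05, 0.0, -0.05]]   # (2+1) x 3

f, df = cg_classify_init(w, w3probs, target)

eps = 1e-6                                                # central difference on one weight
w[1][2] += eps
f_plus, _ = cg_classify_init(w, w3probs, target)
w[1][2] -= 2 * eps
f_minus, _ = cg_classify_init(w, w3probs, target)
w[1][2] += eps
numeric = (f_plus - f_minus) / (2 * eps)
print(abs(numeric - df[1][2]) < 1e-6)                     # True
```

The check relies on every target row summing to 1, which holds for one-hot labels; otherwise the simple (targetout - target) form of the gradient no longer applies.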
CG_CLASSIFY.M
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% This code fine-tunes all of the weights together
% For the individual steps, see the annotations in CG_CLASSIFY_INIT.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [f, df] = CG_CLASSIFY(VV,Dim,XX,target);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
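Only the signature of CG_CLASSIFY is listed here, but like backpropclassify.m it has to push the data through all layers, and that forward pass depends on the bias trick used throughout the code: each weight matrix carries its biases as an extra last row (w1=[vishid; hidrecbiases]) and each activation matrix gets an extra column of ones. The Python sketch below checks on toy numbers that this is the same as adding the biases explicitly. All values and the function names are invented for the illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer_augmented(data, w):
    """sigmoid([data 1] * w), where the last row of w holds the biases."""
    out = []
    for row in data:
        x1 = list(row) + [1.0]                        # the appended column of ones
        out.append([sigmoid(sum(x1[i] * w[i][j] for i in range(len(w))))
                    for j in range(len(w[0]))])
    return out

def layer_explicit(data, weights, biases):
    """sigmoid(data * weights + biases), with the biases kept separate."""
    return [[sigmoid(sum(v * weights[i][j] for i, v in enumerate(row)) + biases[j])
             for j in range(len(biases))]
            for row in data]

def forward(data, stack):
    """Forward pass through a list of bias-augmented weight matrices."""
    for w in stack:
        data = layer_augmented(data, w)
    return data

vishid = [[0.5, -1.0], [0.25, 0.75]]                  # 2 visible x 2 hidden
hidrecbiases = [0.1, -0.2]
w1 = vishid + [hidrecbiases]                          # like w1=[vishid; hidrecbiases]
data = [[1.0, 0.0], [0.5, 0.5]]

a = layer_augmented(data, w1)
b = layer_explicit(data, vishid, hidrecbiases)
same = all(abs(x - y) < 1e-12 for ra, rb in zip(a, b) for x, y in zip(ra, rb))
print(same)                                           # True

w2 = [[1.0], [-1.0], [0.0]]                           # (2+1) x 1, toy second layer
top = forward(data, [w1, w2])
```

Folding the biases into the weight matrices is what lets the Matlab code treat every layer as one matrix product and unroll all parameters into the single vector VV.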
rbmhidlinear.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Identical to rbm.m, except that the hidden unit values are computed with linear units
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
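Since the body of rbmhidlinear.m is not reproduced here, the following Python sketch only illustrates what "linear units" means in contrast to the sigmoid units of rbm.m: the hidden value is the weighted input plus the bias, without the logistic squashing. The sampling step, which adds unit-variance Gaussian noise to that value, is an assumption about the noise model and should be checked against the original file; the function names and numbers are made up.

```python
import random

def linear_hidden(data_row, vishid, hidbiases):
    """Value of the linear hidden units: data * vishid + hidbiases, no sigmoid."""
    return [sum(v * vishid[i][j] for i, v in enumerate(data_row)) + hidbiases[j]
            for j in range(len(hidbiases))]

def sample_linear_hidden(means, rng):
    """Hidden state = mean plus unit-variance Gaussian noise (assumed noise model)."""
    return [m + rng.gauss(0.0, 1.0) for m in means]

vishid = [[1.0, -2.0], [0.5, 4.0]]        # 2 visible x 2 hidden
hidbiases = [0.0, 1.0]
means = linear_hidden([2.0, 1.0], vishid, hidbiases)
print(means)                              # [2.5, 1.0]

states = sample_linear_hidden(means, random.Random(0))
```

Unlike sigmoid probabilities, these values are not confined to the interval (0, 1), which suits a low-dimensional code layer that has to carry real-valued information.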