Introduction to Restricted Boltzmann Machines
转载,原贴地址:Introduction to Restricted Boltzmann Machines,by Edwin Chen, 2011/07/18.
Suppose you ask a bunch of users to rate a set of movies on a 0-100 scale. In classical factor analysis, you could then try to explain each movie and user in terms of a set of latent factors. For example, movies like Star Wars and Lord of the Rings might have strong associations with a latent science fiction and fantasy factor, and users who like Wall-E and Toy Story might have strong associations with a latent Pixar factor.
Restricted Boltzmann Machines essentially perform a binary version of factor analysis. (This is one way of thinking about RBMs; there are, of course, others, and lots of different ways to use RBMs, but I’ll adopt this approach for this post.) Instead of users rating a set of movies on a continuous scale, they simply tell you whether they like a movie or not, and the RBM will try to discover latent factors that can explain the activation of these movie choices.
More technically, a Restricted Boltzmann Machine is a stochastic neural network (neural network meaning we have neuron-like units whose binary activations depend on the neighbors they’re connected to; stochastic meaning these activations have a probabilistic element) consisting of:
- One layer of visible units (users’ movie preferences whose states we know and set);
- One layer of hidden units (the latent factors we try to learn); and
- A bias unit (whose state is always on, and is a way of adjusting for the different inherent popularities of each movie).
Furthermore, each visible unit is connected to all the hidden units (this connection is undirected, so each hidden unit is also connected to all the visible units), and the bias unit is connected to all the visible units and all the hidden units. To make learning easier, we restrict the network so that no visible unit is connected to any other visible unit and no hidden unit is connected to any other hidden unit.
For example, suppose we have a set of six movies (Harry Potter, Avatar, LOTR 3, Gladiator, Titanic, and Glitter) and we ask users to tell us which ones they want to watch. If we want to learn two latent units underlying movie preferences – for example, two natural groups in our set of six movies appear to be SF/fantasy (containing Harry Potter, Avatar, and LOTR 3) and Oscar winners (containing LOTR 3, Gladiator, and Titanic), so we might hope that our latent units will correspond to these categories – then our RBM would look like the following:

(Note the resemblance to a factor analysis graphical model.)
State Activation
Restricted Boltzmann Machines, and neural networks in general, work by updating the states of some neurons given the states of others, so let’s talk about how the states of individual units change. Assuming we know the connection weights in our RBM (we’ll explain how to learn these below), to update the state of unit i :

For example, let’s suppose our two hidden units really do correspond to SF/fantasy and Oscar winners.
- If Alice has told us her six binary preferences on our set of movies, we could then ask our RBM which of the hidden units her preferences activate (i.e., ask the RBM to explain her preferences in terms of latent factors). So the six movies send messages to the hidden units, telling them to update themselves. (Note that even if Alice has declared she wants to watch Harry Potter, Avatar, and LOTR 3, this doesn’t guarantee that the SF/fantasy hidden unit will turn on, but only that it will turn on with high probability. This makes a bit of sense: in the real world, Alice wanting to watch all three of those movies makes us highly suspect she likes SF/fantasy in general, but there’s a small chance she wants to watch them for other reasons. Thus, the RBM allows us to generate models of people in the messy, real world.)
- Conversely, if we know that one person likes SF/fantasy (so that the SF/fantasy unit is on), we can then ask the RBM which of the movie units that hidden unit turns on (i.e., ask the RBM to generate a set of movie recommendations). So the hidden units send messages to the movie units, telling them to update their states. (Again, note that the SF/fantasy unit being on doesn’t guarantee that we’ll always recommend all three of Harry Potter, Avatar, and LOTR 3 because, hey, not everyone who likes science fiction liked Avatar.)
Learning Weights
So how do we learn the connection weights in our network? Suppose we have a bunch of training examples, where each training example is a binary vector with six elements corresponding to a user’s movie preferences. Then for each epoch, do the following:

Continue until the network converges (i.e., the error between the training examples and their reconstructions falls below some threshold) or we reach some maximum number of epochs.
Why does this update rule make sense? Note that

(You may hear this update rule called contrastive divergence, which is basically a fancy term for “approximate gradient descent”.)
Examples
I wrote a simple RBM implementation in Python (the code is heavily commented, so take a look if you’re still a little fuzzy on how everything works), so let’s use it to walk through some examples.
First, I trained the RBM using some fake data.
- Alice: (Harry Potter = 1, Avatar = 1, LOTR 3 = 1, Gladiator = 0, Titanic = 0, Glitter = 0). Big SF/fantasy fan.
- Bob: (Harry Potter = 1, Avatar = 0, LOTR 3 = 1, Gladiator = 0, Titanic = 0, Glitter = 0). SF/fantasy fan, but doesn’t like Avatar.
- Carol: (Harry Potter = 1, Avatar = 1, LOTR 3 = 1, Gladiator = 0, Titanic = 0, Glitter = 0). Big SF/fantasy fan.
- David: (Harry Potter = 0, Avatar = 0, LOTR 3 = 1, Gladiator = 1, Titanic = 1, Glitter = 0). Big Oscar winners fan.
- Eric: (Harry Potter = 0, Avatar = 0, LOTR 3 = 1, Gladiator = 1, Titanic = 1, Glitter = 0). Oscar winners fan, except for Titanic.
- Fred: (Harry Potter = 0, Avatar = 0, LOTR 3 = 1, Gladiator = 1, Titanic = 1, Glitter = 0). Big Oscar winners fan.
The network learned the following weights:

Note that the first hidden unit seems to correspond to the Oscar winners, and the second hidden unit seems to correspond to the SF/fantasy movies, just as we were hoping.
What happens if we give the RBM a new user, George, who has (Harry Potter = 0, Avatar = 0, LOTR 3 = 0, Gladiator = 1, Titanic = 1, Glitter = 0) as his preferences? It turns the Oscar winners unit on (but not the SF/fantasy unit), correctly guessing that George probably likes movies that are Oscar winners.
What happens if we activate only the SF/fantasy unit, and run the RBM a bunch of different times? In my trials, it turned on Harry Potter, Avatar, and LOTR 3 three times; it turned on Avatar and LOTR 3, but not Harry Potter, once; and it turned on Harry Potter and LOTR 3, but not Avatar, twice. Note that, based on our training examples, these generated preferences do indeed match what we might expect real SF/fantasy fans want to watch.
Modifications
I tried to keep the connection-learning algorithm I described above pretty simple, so here are some modifications that often appear in practice:

Introduction to Restricted Boltzmann Machines的更多相关文章
- 受限波兹曼机导论Introduction to Restricted Boltzmann Machines
Suppose you ask a bunch of users to rate a set of movies on a 0-100 scale. In classical factor analy ...
- (六)6.14 Neurons Networks Restricted Boltzmann Machines
1.RBM简介 受限玻尔兹曼机(Restricted Boltzmann Machines,RBM)最早由hinton提出,是一种无监督学习方法,即对于给定数据,找到最大程度拟合这组数据的参数.RBM ...
- CS229 6.14 Neurons Networks Restricted Boltzmann Machines
1.RBM简介 受限玻尔兹曼机(Restricted Boltzmann Machines,RBM)最早由hinton提出,是一种无监督学习方法,即对于给定数据,找到最大程度拟合这组数据的参数.RBM ...
- Convolutional Restricted Boltzmann Machines
参考论文:1.Stacks of Convolutional Restricted Boltzmann Machines for Shift-Invariant Feature Learning ...
- 限制波尔兹曼机(Restricted Boltzmann Machines)
能量模型的概念从统计力学中得来,它描述着整个系统的某种状态,系统越有序,系统能量波动越小,趋近于平衡状态,系统越无序,能量波动越大.例如:一个孤立的物体,其内部各处的温度不尽相同,那么热就从温度较高的 ...
- Restricted Boltzmann Machines
转自:http://deeplearning.net/tutorial/rbm.html http://blog.csdn.net/mytestmy/article/details/9150213 能 ...
- 受限玻尔兹曼机(RBM, Restricted Boltzmann machines)和深度信念网络(DBN, Deep Belief Networks)
受限玻尔兹曼机对于当今的非监督学习有一定的启发意义. 深度信念网络(DBN, Deep Belief Networks)于2006年由Geoffery Hinton提出.
- 限制Boltzmann机(Restricted Boltzmann Machine)
起源:Boltzmann神经网络 Boltzmann神经网络的结构是由Hopfield递归神经网络改良过来的,Hopfield中引入了统计物理学的能量函数的概念. 即,cost函数由统计物理学的能量函 ...
- 限制玻尔兹曼机(Restricted Boltzmann Machine)RBM
假设有一个二部图,每一层的节点之间没有连接,一层是可视层,即输入数据是(v),一层是隐藏层(h),如果假设所有的节点都是随机二值变量节点(只能取0或者1值)同时假设全概率分布满足Boltzmann 分 ...
随机推荐
- Java虚拟机new对象
类加载检查java虚拟机在遇到一条 new 指令时,首先会检查是否能在常量池中定位到这个类的符号引用,并且是否已被加载过.解析和初始化过.如果没有,那必须先执行类加载过程 类加载的相关知识可参考:JV ...
- 【HANA系列】对话SAP全球CEO孟鼎铭:未来最大的发展机遇属于中国中小企业
公众号:SAP Technical 本文作者:matinal 原文出处:http://www.cnblogs.com/SAPmatinal/ 原文链接:[HANA系列]对话SAP全球CEO孟鼎铭:未来 ...
- zebra 配置问题导致服务起不来
由于配置错误的原因,导致 zebra 起不来,具体报错如下: zebra 起不来,导致 ospf 也起不来,报错如下: Job ospfd.service/start failed with resu ...
- 【机器学习】ICA特征提取
看完了ICA的一整套原理介绍后,感觉完整的介绍和andrew ng的课程中的ICA特征提取关系不是很大:在ICA的理论中,主要用于盲源分离的,也就是混合的观测数据X,通过一个正交的且其范数为1的分离矩 ...
- ROS topic,service和action的使用场景
参考:ROS中关于topic和service的运用场合 Topics 特点: 1.单向,分工明确,处理连续数据流,topic是一种多对多的形式,一个Node可以订阅多个Topic,可以publish到 ...
- Spring之一:IoC容器体系结构
温故而知心. Spring IoC概述 常说spring的控制反转(依赖反转),看看维基百科的解释: 如果合作对象的引用或依赖关系的管理要由具体对象来完成,会导致代码的高度耦合和可测试性降低,这对复杂 ...
- Magazine Delivery(POJ1695)【DP】
题意:要求用三辆车往n座城市投递货物,起点都在一号城市,每辆车可以载任意数量的货物,投递顺序必须与城市编号递增序一致,并且,每次同时都只能有一辆车在跑路.求最短总路径之和. 思路:每时每刻,能够充分决 ...
- HTML DOM focus() 方法
目录 HTML DOM focus() 方法 实例 定义和使用 浏览器支持 语法 参数 技术描述 更多实例 实例 实例 HTML DOM focus() 方法 实例 为 <a> 元素设置焦 ...
- 关于vs code文本编辑器的快捷键
另一篇编辑器Sublime Text下载.使用教程.插件推荐说明.全套快捷键 基础编辑 快捷键 作用 Ctrl+X 剪切 Ctrl+C 复制 Ctrl+Shift+K 删除当前行 Ctrl+Enter ...
- sql之游标
--select * from master..sysprocessesuse testdeclare my_cursor cursor scroll dynamic --scroll表示可以向前 ...