Reposted. Original post: Introduction to Restricted Boltzmann Machines, by Edwin Chen, 2011/07/18.

Suppose you ask a bunch of users to rate a set of movies on a 0-100 scale. In classical factor analysis, you could then try to explain each movie and user in terms of a set of latent factors. For example, movies like Star Wars and Lord of the Rings might have strong associations with a latent science fiction and fantasy factor, and users who like Wall-E and Toy Story might have strong associations with a latent Pixar factor.

Restricted Boltzmann Machines essentially perform a binary version of factor analysis. (This is one way of thinking about RBMs; there are, of course, others, and lots of different ways to use RBMs, but I’ll adopt this approach for this post.) Instead of users rating a set of movies on a continuous scale, they simply tell you whether they like a movie or not, and the RBM will try to discover latent factors that can explain the activation of these movie choices.

More technically, a Restricted Boltzmann Machine is a stochastic neural network (neural network meaning we have neuron-like units whose binary activations depend on the neighbors they’re connected to; stochastic meaning these activations have a probabilistic element) consisting of:

  • One layer of visible units (users’ movie preferences whose states we know and set);
  • One layer of hidden units (the latent factors we try to learn); and
  • A bias unit (whose state is always on, and is a way of adjusting for the different inherent popularities of each movie).

Furthermore, each visible unit is connected to all the hidden units (this connection is undirected, so each hidden unit is also connected to all the visible units), and the bias unit is connected to all the visible units and all the hidden units. To make learning easier, we restrict the network so that no visible unit is connected to any other visible unit and no hidden unit is connected to any other hidden unit.
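To make this structure concrete, here is a minimal sketch of the parameters such a network needs (the names are mine, not from any particular implementation): because the only edges run between the two layers, one visible-by-hidden weight matrix, plus the bias unit's weights into each layer, describes the entire model.

```python
import numpy as np

num_visible, num_hidden = 6, 2  # six movies, two latent factors

# No visible-visible or hidden-hidden connections are allowed, so a
# single matrix captures the whole (restricted) bipartite graph.
rng = np.random.default_rng(0)
weights = 0.1 * rng.standard_normal((num_visible, num_hidden))

# The bias unit is always on; its outgoing weights act as one bias per
# visible unit and one per hidden unit.
visible_bias = np.zeros(num_visible)
hidden_bias = np.zeros(num_hidden)
```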

For example, suppose we have a set of six movies (Harry Potter, Avatar, LOTR 3, Gladiator, Titanic, and Glitter) and we ask users to tell us which ones they want to watch. If we want to learn two latent units underlying movie preferences – for example, two natural groups in our set of six movies appear to be SF/fantasy (containing Harry Potter, Avatar, and LOTR 3) and Oscar winners (containing LOTR 3, Gladiator, and Titanic), so we might hope that our latent units will correspond to these categories – then our RBM would look like the following:

(Note the resemblance to a factor analysis graphical model.)

State Activation

Restricted Boltzmann Machines, and neural networks in general, work by updating the states of some neurons given the states of others, so let's talk about how the states of individual units change. Assuming we know the connection weights in our RBM (we'll explain how to learn these below), to update the state of unit i:

  • Compute the activation energy a_i = Σ_j w_ij x_j of unit i, where the sum runs over all units j that unit i is connected to, w_ij is the weight of the connection between i and j, and x_j is the 0-or-1 state of unit j. In other words, all of unit i's neighbors send it a message, and we compute the sum of all these messages.
  • Set x_i to 1 with probability σ(a_i), and to 0 with probability 1 − σ(a_i), where σ(x) = 1/(1 + exp(−x)) is the logistic function.
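Here is a minimal numpy sketch of that rule for the hidden layer (function and variable names are my own, not the post's code):

```python
import numpy as np

def sigmoid(a):
    # The logistic function sigma(a) = 1 / (1 + exp(-a)).
    return 1.0 / (1.0 + np.exp(-a))

def update_hidden(v, weights, hidden_bias, rng):
    # Each hidden unit's activation energy: its bias plus the weighted
    # messages from all the visible units it is connected to.
    activation = hidden_bias + v @ weights
    p_on = sigmoid(activation)
    # Turn each unit on with probability sigma(a), off otherwise.
    return (rng.random(p_on.shape) < p_on).astype(int)
```

The visible-layer update is symmetric: swap in weights.T and the visible biases.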

For example, let’s suppose our two hidden units really do correspond to SF/fantasy and Oscar winners.

  • If Alice has told us her six binary preferences on our set of movies, we could then ask our RBM which of the hidden units her preferences activate (i.e., ask the RBM to explain her preferences in terms of latent factors). So the six movies send messages to the hidden units, telling them to update themselves. (Note that even if Alice has declared she wants to watch Harry Potter, Avatar, and LOTR 3, this doesn’t guarantee that the SF/fantasy hidden unit will turn on, but only that it will turn on with high probability. This makes a bit of sense: in the real world, Alice wanting to watch all three of those movies makes us highly suspect she likes SF/fantasy in general, but there’s a small chance she wants to watch them for other reasons. Thus, the RBM allows us to generate models of people in the messy, real world.)
  • Conversely, if we know that one person likes SF/fantasy (so that the SF/fantasy unit is on), we can then ask the RBM which of the movie units that hidden unit turns on (i.e., ask the RBM to generate a set of movie recommendations). So the hidden units send messages to the movie units, telling them to update their states. (Again, note that the SF/fantasy unit being on doesn’t guarantee that we’ll always recommend all three of Harry Potter, Avatar, and LOTR 3 because, hey, not everyone who likes science fiction liked Avatar.)

Learning Weights

So how do we learn the connection weights in our network? Suppose we have a bunch of training examples, where each training example is a binary vector with six elements corresponding to a user's movie preferences. Then for each epoch, do the following:

  • Take a training example (a set of six movie preferences) and set the states of the visible units to these preferences.
  • Next, update the states of the hidden units using the logistic activation rule described above, and then, for each edge e_ij, compute Positive(e_ij) = x_i * x_j (i.e., for each pair of connected units, measure whether they're both on).
  • Now reconstruct the visible units in a similar manner: compute each visible unit's activation energy from the hidden states and update its state stochastically. Then update the hidden units once more, and compute Negative(e_ij) = x_i * x_j for each edge.
  • Update the weight of each edge: w_ij = w_ij + L * (Positive(e_ij) − Negative(e_ij)), where L is a learning rate.
  • Repeat over all training examples.

Continue until the network converges (i.e., the error between the training examples and their reconstructions falls below some threshold) or we reach some maximum number of epochs.
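As a sketch, one epoch of this procedure in numpy might look like the following (biases omitted for brevity; names are mine):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def cd1_epoch(data, weights, learning_rate, rng):
    # One pass of single-step contrastive divergence over the data.
    for v0 in data:
        # Positive phase: clamp the visible units to the training
        # example and sample the hidden units.
        h0 = (rng.random(weights.shape[1]) < sigmoid(v0 @ weights)).astype(float)
        positive = np.outer(v0, h0)          # Positive(e_ij) = x_i * x_j

        # Reconstruction: regenerate the visibles from h0, then update
        # the hidden units once more.
        v1 = (rng.random(weights.shape[0]) < sigmoid(h0 @ weights.T)).astype(float)
        h1 = (rng.random(weights.shape[1]) < sigmoid(v1 @ weights)).astype(float)
        negative = np.outer(v1, h1)          # Negative(e_ij) = x_i * x_j

        # w_ij <- w_ij + L * (Positive(e_ij) - Negative(e_ij))
        weights += learning_rate * (positive - negative)
    return weights
```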

Why does this update rule make sense? Note that

  • in the first phase, Positive(e_ij) measures the association between the i-th and j-th units that we want the network to learn from our training examples; and
  • in the "reconstruction" phase, where the RBM generates the states of visible units based on its hypotheses about the hidden units alone, Negative(e_ij) measures the association that the network itself generates (or "daydreams" about) when no units are clamped to the training data.

So by adding Positive(e_ij) − Negative(e_ij) to each edge weight, we're helping the network's weights better mimic the training data.

(You may hear this update rule called contrastive divergence, which is basically a fancy term for “approximate gradient descent”.)
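For reference, the quantity this procedure approximates is the gradient of the log-likelihood of the training data. For a single training vector v it can be written as

```latex
\frac{\partial \log p(v)}{\partial w_{ij}}
  = \langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{model}}
```

Positive(e_ij) is a one-sample estimate of the first (data) expectation, and the intractable second (model) expectation is replaced by the one-step reconstruction statistic Negative(e_ij); that substitution is what makes the gradient descent "approximate".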

Examples

I wrote a simple RBM implementation in Python (the code is heavily commented, so take a look if you’re still a little fuzzy on how everything works), so let’s use it to walk through some examples.

First, I trained the RBM using some fake data.

  • Alice: (Harry Potter = 1, Avatar = 1, LOTR 3 = 1, Gladiator = 0, Titanic = 0, Glitter = 0). Big SF/fantasy fan.
  • Bob: (Harry Potter = 1, Avatar = 0, LOTR 3 = 1, Gladiator = 0, Titanic = 0, Glitter = 0). SF/fantasy fan, but doesn’t like Avatar.
  • Carol: (Harry Potter = 1, Avatar = 1, LOTR 3 = 1, Gladiator = 0, Titanic = 0, Glitter = 0). Big SF/fantasy fan.
  • David: (Harry Potter = 0, Avatar = 0, LOTR 3 = 1, Gladiator = 1, Titanic = 1, Glitter = 0). Big Oscar winners fan.
  • Eric: (Harry Potter = 0, Avatar = 0, LOTR 3 = 1, Gladiator = 1, Titanic = 0, Glitter = 0). Oscar winners fan, except for Titanic.
  • Fred: (Harry Potter = 0, Avatar = 0, LOTR 3 = 1, Gladiator = 1, Titanic = 1, Glitter = 0). Big Oscar winners fan.
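Written as a matrix that could be fed to a trainer like the cd1_epoch sketch above (one row per user, columns in the movie order given):

```python
import numpy as np

# Rows: Alice, Bob, Carol, David, Eric, Fred.
# Columns: Harry Potter, Avatar, LOTR 3, Gladiator, Titanic, Glitter.
training_data = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 0],
])
```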

The network learned the following weights:

Note that the first hidden unit seems to correspond to the Oscar winners, and the second hidden unit seems to correspond to the SF/fantasy movies, just as we were hoping.

What happens if we give the RBM a new user, George, who has (Harry Potter = 0, Avatar = 0, LOTR 3 = 0, Gladiator = 1, Titanic = 1, Glitter = 0) as his preferences? It turns the Oscar winners unit on (but not the SF/fantasy unit), correctly guessing that George probably likes movies that are Oscar winners.

What happens if we activate only the SF/fantasy unit, and run the RBM a bunch of different times? In my trials, it turned on Harry Potter, Avatar, and LOTR 3 three times; it turned on Avatar and LOTR 3, but not Harry Potter, once; and it turned on Harry Potter and LOTR 3, but not Avatar, twice. Note that, based on our training examples, these generated preferences do indeed match what we might expect real SF/fantasy fans to want to watch.
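Both kinds of query amount to a single application of the activation rule from earlier. A sketch (biases again omitted, with weights standing in for the learned 6×2 matrix from training):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def run_visible(v, weights, rng):
    # Clamp the visible units to v and sample the hidden units.
    p = sigmoid(v @ weights)
    return (rng.random(p.shape) < p).astype(int)

def run_hidden(h, weights, rng):
    # Clamp the hidden units to h and sample the visible units.
    p = sigmoid(h @ weights.T)
    return (rng.random(p.shape) < p).astype(int)

rng = np.random.default_rng(0)
george = np.array([0, 0, 0, 1, 1, 0])
# run_visible(george, weights, rng)           -> which latent factors fire
# run_hidden(np.array([0, 1]), weights, rng)  -> a "daydreamed" set of movies
```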

Modifications

I tried to keep the connection-learning algorithm I described above pretty simple, so here are some modifications that often appear in practice:

  • When computing Positive(e_ij) and Negative(e_ij), use the units' activation probabilities instead of the sampled binary states, which lowers the variance of the learning signal.
  • Penalize large edge weights (e.g., with an L2 penalty) to get a more regularized model.
  • Use a momentum factor when updating weights: add a fraction of the previous update step to the current one, which smooths and speeds up learning.
  • Instead of updating the weights after every single training example, accumulate updates over mini-batches of examples, which lets an implementation exploit fast matrix multiplication.
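As a sketch of how several of these fit together (mini-batches, probabilities instead of samples for the statistics, momentum, and an L2 weight penalty; the parameter values are illustrative defaults, not tuned):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def cd1_minibatch_update(batch, weights, velocity, rng,
                         learning_rate=0.1, momentum=0.5, weight_decay=1e-4):
    # Positive phase on the whole mini-batch at once, using activation
    # probabilities rather than sampled binary states for the statistics.
    h_prob0 = sigmoid(batch @ weights)
    h0 = (rng.random(h_prob0.shape) < h_prob0).astype(float)

    # One-step reconstruction.
    v_prob1 = sigmoid(h0 @ weights.T)
    h_prob1 = sigmoid(v_prob1 @ weights)

    # Averaged gradient estimate, momentum, and an L2 penalty.
    grad = (batch.T @ h_prob0 - v_prob1.T @ h_prob1) / len(batch)
    velocity = momentum * velocity + learning_rate * (grad - weight_decay * weights)
    return weights + velocity, velocity
```

Here velocity starts as np.zeros_like(weights) and carries a fraction of each update step into the next one.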
