Markov Random Fields
We have seen that directed graphical models specify a factorization of the joint distribution over a set of variables into a product of local conditional distributions. They also define a set of conditional independence properties that must be satisfied by any distribution that factorizes according to the graph. We turn now to the second major class of graphical models that are described by undirected graphs and that again specify both a factorization and a set of conditional independence relations.
A Markov random field, also known as a Markov network or an undirected graphical model, has a set of nodes each of which corresponds to a variable or group of variables, as well as a set of links each of which connects a pair of nodes. The links are undirected, that is they do not carry arrows. In the case of undirected graphs, it is convenient to begin with a discussion of conditional independence properties.
8.3.1 Conditional independence properties
In the case of directed graphs, we saw that it was possible to test whether a particular conditional independence property holds by applying a graphical test called d-separation. This involved testing whether or not the paths connecting two sets of nodes were 'blocked'. The definition of blocked, however, was somewhat subtle due to the presence of paths having head-to-head nodes. We might ask whether it is possible to define an alternative graphical semantics for probability distributions such that conditional independence is determined by simple graph separation. This is indeed the case and corresponds to undirected graphical models. By removing the directionally from the links of the graph, the asymmetry between parent and child nodes is removed, and so the subtleties associated with head-to-head nodes no longer arise.
Suppose that in an undirected graph we identify three sets of nodes, denoted A, B, and C, and what we consider the conditional independence property.
To test whether this property is satisfied by a probability distribution defined by a graph we consider all possible paths that connect nodes in set A to nodes in set B. If all such paths pass through one or more nodes in set C, then all such paths are 'blocked' and so the conditional independence property holds. However, if there is at least one such path that is not blocked, then the property does not necessarily hold, or more precisely there will exist at least some distributions corresponding to the graph that do not satisfy this conditional independence relation. Note that this is exactly the same as the d-separation criterion except that there is no 'explaining away' phenomenon. Testing for conditional independence in undirected graphs is therefore simpler than in directed graphs.
An alternative way t view the conditional independence test is to imagine removing all nodes in set C from the graph together with any node in A to any node in B. If there are no such paths, then the conditional independence property must hold.
The Markov blanket for an undirected graph takes a particularly simple form, because a node will be conditionally independent of all other nodes conditioned only on the neighbouring nodes.
8.3.2 Factorization properties
We now seek a factorization rule for undirected graphs that will correspond to the above conditional independence test. Again, this will involve expressing th joint distribution
If we consider two nodes
where
This leads us to consider a graphical concept called a clique, which is defined as a subset of the nodes in a graph such that there exists a link between all pairs of nodes in the subset. In other words, the set of nodes in a clique is fully connected. Furthermore, a maximal clique is a clique such that it is not possible to include any are illustrated by the undirected graph over four variables. This graph has five cliques of two nodes given by
We can therefore define the factors in the decomposition of the joint distribution to be functions of the variables in the cliques. In fact, we can consider functions of the maximal cliques, without loss of generality, because other cliques must be subsets of maximal cliques. Thus, if
Let us denote a clique by
Here the quantity
which ensures that the distribution
Note that we do not restrict the choice of potential functions to those that have a specific probabilistic interpretation as marginal or conditional distributions. This is in contrast to directed graphs in which each factor represents the conditional distribution of the corresponding variable, conditioned on the state of its parents. However, in special cases, for instance where the undirected graph is constructed by starting with a directed graph, the potential functions may indeed have such an interpretation, as we shall see shortly.
One consequence of the generality of the potential functions
The presence of this normalization constant is one of the major limitations of undirected graphs. If we have a model with
So far, we have discussed the notion of conditional independence based on simple graph separation and we have proposed a factorization of the joint distribution that is intended to correspond to this conditional independence structure. However, we have not made any formal connection independence structure. However, we have not made any formal connection between conditional independence and factorization for undirected graphs. To do so we need to restrict attention to potential functions
To do this we again return to the concept of a graphical model as a filter. Consider he set of all possible distributions defined over a fixed set of variables corresponding to the nodes of a particular undirected graph. We can define
Because we are restricted to potential functions which are strictly positive it is convenient to express them as exponentials, so that
where
In contrast to the factors in the joint distribution for a directed graph, the potentials in an undirected graph do not have a specific probabilities interpretation. Although this gives greater flexibility in choosing the potential functions, because there is no normalization constraint, it does raise the question of how to motivate a choice of potential function for a particular application. This can be done by viewing the potential function as expressing which configurations of the local variables are preferred to others. Global configurations that have a relatively high probability are those that find a good balance in satisfying the (possibly conflicting) influences of the clique potentials.
8.3.4 Relation to directed graphs
We have introduced two graphical frameworks for representing probability distributions, corresponding to directed and undirected graphs, and it is instructive to discuss the relation between these. Consider first the problem of taking a model that is specified using a directed graph and trying to convert it to an undirected graph. In some cases this is straightforward. Here the joint distribution for the directed graph is given as a product of conditionals in the form
Now let us convert this to an undirected graph representation. In the undirected graph, the maximal cliques are simply the pairs of neighbouring nodes, and so from (2) we wish to write the joint distribution in the form
Let us consider how to generalize this construction, so that we can convert any distribution specified by a factorization over a directed graph into one specified by a factorization over an undirected graph. This can be achieved if the clique potentials of the undirected graph are given by the conditional distributions of the directed graph. In order for this to be valid, we must ensure that the set of variables that appears in each of the conditional distributions is a member of at least on clique of the undirected graph. For nodes on the directed graph having just one parent, this is achieved simply by replacing the directed link with an undirected link. However, for nodes in the directed graph having more than one parent, this is not sufficient. These are nodes that have 'head-to-head' paths encountered in our discussion of conditional independence. The joint distribution for the directed graph takes the form
We see that the factor
Thus in general to convert a directed graph into an undirected graph, we first add additional undirected links between all pairs of parents for each node in the graph and then drop the arrows on the original links to give the moral graph. Then we initialize all of the clique potentials of the moral graph to 1. We then take each conditional distribution factor in the original directed graph and multiply it into one of the clique potentials. There will always exist at least one maximal clique that contains all of the variables in the factor as a result of the moralization step. Note that in all cases the partition function is given by
The process of converting a directed graph into an undirected graph plays an important role in exact inference techniques such as the junction tree algorithm. Converting from an undirected to a directed representation is much less common and in general presents problems due to the normalization constraints.
We saw that in going from a directed to an undirected representation we had to discard some conditional independence properties from the graph. Of course, we could always trivially convert any distribution over a directed graph into one over an undirected graph by simply using a fully connected undirected graph. This would, however, discard all conditional independence properties and so would be vacuous. The process of moralization adds the fewest extra links and so retains the maximum number of independence properties.
We have seen that the procedure for determining the conditional independence properties is different between directed and undirected graphs. It turns out that the two types of graph can express different conditional independence properties, and it is worth exploring this issue in more detail. To do so, we return to the view of a specific (directed or undirected) graph as a filter, so that the set of all possible distributions over the given variables could be reduced to a subset that respects the conditional independence implied by the graph. A graph is said to be a D map (for 'dependency map') of a distribution if every conditional independence statement satisfied by the distribution is reflected in the graph. Thus a completely disconnected graph (no links) will be a trivial D amp for any distribution.
Markov Random Fields的更多相关文章
- 马尔可夫随机场(Markov random fields) 概率无向图模型 马尔科夫网(Markov network)
上面两篇博客,解释了概率有向图(贝叶斯网),和用其解释条件独立.本篇将研究马尔可夫随机场(Markov random fields),也叫无向图模型,或称为马尔科夫网(Markov network) ...
- (转)Image Segmentation with Tensorflow using CNNs and Conditional Random Fields
Daniil's blog Machine Learning and Computer Vision artisan. About/ Blog/ Image Segmentation with Ten ...
- Superpixel Based RGB-D Image Segmentation Using Markov Random Field——阅读笔记
1.基本信息 题目:使用马尔科夫场实现基于超像素的RGB-D图像分割: 作者所属:Ferdowsi University of Mashhad(Iron) 发表:2015 International ...
- Conditional Random Fields (CRF) 初理解
1,Conditional Random Fields
- 马尔科夫随机场(Markov Random Field)
马尔可夫随机场(Markov Random Field),它包含两层意思:一是什么是马尔可夫,二是什么是随机场. 马尔可夫过程可以理解为其当前的状态只与上一刻有关而与以前的是没有关系的.X(t+1)= ...
- 论文翻译:Conditional Random Fields as Recurrent Neural Networks
Conditional Random Fields as Recurrent Neural Networks ICCV2015 cite237 1摘要: 像素级标注的重要性(语义分割 图像理解) ...
- an introduction to conditional random fields
1.Structured prediction methods are essentially a combination of classification and graphical modeli ...
- 条件随机场 Conditional Random Fields
简介 假设你有冠西哥一天生活中的照片(这些照片是按时间排好序的),然后你很无聊的想给每张照片打标签(Tag),比如这张是冠西哥在吃饭,那张是冠西哥在睡觉,那么你该怎么做呢? 一种方法是不管这些照片的序 ...
- 随机场(Random field)
一.随机场定义 http://zh.wikipedia.org/zh-cn/随机场 随机场(Random field)定义如下: 在概率论中, 由样本空间Ω = {0, 1, …, G − 1}n取样 ...
随机推荐
- python发送邮件方法
1.普通文本邮件 #!/usr/bin/env python # -*- coding:utf-8 -*- import smtplib from email.mime.text import MIM ...
- qq2440启动linux后出现错误提示request_module: runaway loop modprobe binfmt-464c
1.情景: 编译busybox时加了make CROSS_COMPILE=arm-linux-,但是还是出现了此情况! 2.解决方案如下: 配置busybox时,在配置中发现busybox setti ...
- 一步一步学习underscore的封装和扩展方式
前言 underscore虽然有点过时,这些年要慢慢被Lodash给淘汰或合并. 但通过看它的源码,还是能学到一个库的封装和扩展方式. 第一步,不污染全局环境. ES5中的JS作用域是函数作用域. 函 ...
- react项目组件化思考
三个原则 single store render from top immutable data single store,便于组件之间通信. render from top,因为store就一个,每 ...
- supervisor安装和配置
直接命令 easy_install supervisor 如果报错先安装 yum install python-setuptools,再上面一条命令: 安装成功后显示finished,我们再次进行py ...
- javacsript Numnber 对象
引子: initUppercent = (uploadedSize / file.size * 100).toFixed(2) + '%'; 一.javaScript Number对象 ------- ...
- spring,maven,dubbo配置
首先我写的这个不是介绍原理的东西,只是指明在我在使用的过程中遇见的一些疑惑的,最后我的理解,你要看详细的配置的话可以看网上的,这个一大堆的.其实dubbo的原理从模型上来看是很简单的东西,完全可以把这 ...
- 浏览器html页面乱码问题分析
直接访问某html文件,浏览器显示编码是正常的,页面通过<meta charset="UTF-8">指定了编码方式,该文件存储编码也是utf8. 通过配置的org.sp ...
- centos6.5安装sublime text 2
今天在看ueillemmx的博客的时候,看到一神级编辑器,随即安装试了试,我了个去,果然好用,自动补全,自动对齐,样样精通啊! 下面是根据ueillemmx的步骤在CentOS上安装Sublime的过 ...
- 运行Python脚本的方法
1.安装完Python后,添加环境变量---在系统变量中找到Path ,点击编辑把你的python安装目录放到里面,注意环境变量之间用";"隔开.打开CMD,在CMD命令行中,输入 ...