How to generate a sample from $p(x)$?

Let's first see how Matlab samples from a $p(x)$. Matlab has built-in functions for several common probability distributions.

Try univariate Gaussian distribution

p= normpdf(linspace(xmin , xmax , num_of_points) , mean, standard_deviation);%PDF
c= normcdf(linspace(xmin , xmax , num_of_points) , mean, standard_deviation);%CDF
y=normrnd(mean, standard_deviation,number_of_samples, 1);%Random Number Generating Method

Try PDF:

x=linspace(-1,1,1000);
p=normpdf(x ,0, 1);
plot(x,p);

Note: linspace returns a row vector, which is usually accessed like this

x(1)%the first elem, not x(0)
x(:)%all elements, as a column
x(1,:)%the whole first (and only) row

Try random sampling:

y=normrnd(0, 1,100, 1);%also try drawing 10000 samples
hist( y , 20 );%20 bars
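To see how closely such a histogram follows the PDF, the bin counts can be rescaled into a density and compared with the Gaussian formula. The following is only a minimal sketch: it uses the core function randn instead of normrnd so that it runs without any toolbox, and the names n, binwidth, density and max_gap are made up for this illustration.

```matlab
n = 10000;                              % number of samples (illustrative choice)
y = randn(n, 1);                        % standard normal samples, mean 0 and std 1
[counts, centers] = hist(y, 20);        % counts and centers of 20 equally spaced bins
binwidth = centers(2) - centers(1);     % all bins have the same width
density = counts / (n * binwidth);      % rescale so that the bars integrate to 1
pdf_at_centers = exp(-centers.^2 / 2) / sqrt(2*pi);  % N(0,1) density at the bin centers
max_gap = max(abs(density - pdf_at_centers));        % largest deviation from the PDF
```

With more samples the value of max_gap should shrink, which is one way to see why drawing 10000 samples gives a smoother picture than drawing 100.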

Try univariate uniform distribution

p= unifpdf(linspace(xmin , xmax , num_of_points) , a,b);%PDF
c= unifcdf(linspace(xmin , xmax , num_of_points) , a,b);%CDF
y=unifrnd(a,b,number_of_samples, 1);%RNG

Try PDF:

x=linspace(-10,10,1000);
p= unifpdf(x ,-5,5);
plot(x,p);

Try random sampling:

y=unifrnd(-5, 5,100, 1);%also try drawing 10000 samples
hist( y , 20 );%20 bars

Matlab provides random number generating functions for some standard $p(x)$, but it doesn't provide sampling functions for a general $p(x)$. Here I show some common sampling methods.

Inverse Transform Sampling (ITS)

with discrete variables

This method generates random numbers from any probability distribution, given the inverse of its cumulative distribution function (CDF). The idea is to sample uniformly distributed random numbers between 0 and 1 and then transform these values using the inverse CDF (InvCDF), which can be discrete or continuous. If the InvCDF is discrete, the ITS method only requires a table lookup, as shown in Table 1.

Table 1. Probability of digits observed in human random digit generation experiment

Matlab's randsample function can carry out this sampling process using the probabilities in Table 1. See the code below.

%Note: randsample is not part of the Octave core package; install the statistics package from http://octave.sourceforge.net/statistics/ before using it.

% probabilities for each digit
theta=[0.000; ... % digit 0
0.100; ... % digit 1
0.090; ... % digit 2
0.095; ... % digit 3
0.200; ... % digit 4
0.175; ... % digit 5
0.190; ... % digit 6
0.050; ... % digit 7
0.100; ... % digit 8
0.000]; % digit 9
seed = 1; rand( 'state' , seed );% fix the random number generator
K = 10000;% let's say we draw K random values
digitset = 0:9;
Y = randsample(digitset,K,true,theta);
figure( 1 ); clf;
counts = hist( Y , digitset );
bar( digitset , counts , 'k' );
xlim([-0.5 9.5]);
xlabel( 'Digit' );
ylabel( 'Frequency' );
title( 'Distribution of simulated draws of human digit generator' );
pause;

Instead of using built-in functions such as randsample or mnrnd, it is helpful to consider how to implement the underlying sampling algorithm with the inverse transform method, which is:

(1) Calculate the CDF $F(x)$.

(2) Sample $u$ from Uniform(0,1).

(3) Get a sample $x^{i}$ of $P(X)$ as $x^{i}=F^{-1}(u)$.

(4) Repeat (2) and (3) until we get enough samples.

Note: For discrete distributions, $F^{-1}$ is also discrete: the sample $x^{i}$ is the smallest value whose cumulative probability $F(x^{i})$ is at least $u$. This is illustrated below, where $u=0.8,~x^{i}=6$.
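The four steps can be sketched directly with a cumulative sum and a table lookup. This is only one possible hand-written version, not what randsample does internally; it reuses the digit probabilities from the code above, draws the uniform numbers with the core function rand, and the names F, u, idx and Y are chosen for this illustration.

```matlab
% probabilities for digits 0..9, same values as in the randsample example
theta = [0.000 0.100 0.090 0.095 0.200 0.175 0.190 0.050 0.100 0.000];
digitset = 0:9;
F = cumsum(theta);        % step (1): CDF evaluated at each digit
F(end) = 1;               % guard against floating-point round-off in the last entry
K = 10000;                % number of samples to draw
u = rand(K, 1);           % step (2): K draws from Uniform(0,1)
Y = zeros(K, 1);
for k = 1:K
    % step (3): the first digit whose cumulative probability reaches u(k)
    idx = find(u(k) <= F, 1, 'first');
    Y(k) = digitset(idx);
end
```

For $u=0.8$ the lookup returns digit 6, because $F(5)=0.66<0.8\le F(6)=0.85$. Replacing randsample with this loop in the earlier script should give a bar chart of the same shape.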

with continuous variables

This can be done with the following procedure:

(1) Draw U ∼ Uniform(0, 1).

(2) Set $X=F^{-1}(U)$.

(3) Repeat (1) and (2) until enough samples are drawn.

For example, suppose we want to sample random numbers from the exponential distribution with CDF $F(x|\lambda)=1-\exp(-x/\lambda)$. Solving $u=1-\exp(-x/\lambda)$ for $x$ gives $F^{-1}(u|\lambda)=-\lambda\log(1-u)$, which takes the place of $F^{-1}(U)$ in step (2). The code below uses $\lambda=2$.

p=-log(1-unifrnd(0,1,10000,1))*2;
hist(p,30);
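A simple sanity check for such a sampler is to compare sample moments with the known ones: for this parameterization the exponential distribution has mean $\lambda$ and variance $\lambda^{2}$. The sketch below assumes $\lambda=2$ as above, uses the core function rand in place of unifrnd, and the names lambda, N, sample_mean and sample_var are illustrative.

```matlab
lambda = 2;                      % scale parameter of the exponential distribution
N = 100000;                      % number of samples (illustrative choice)
u = rand(N, 1);                  % U ~ Uniform(0,1)
x = -lambda * log(1 - u);        % X = F^{-1}(U) for the exponential CDF
sample_mean = mean(x);           % should be close to lambda
sample_var = var(x);             % should be close to lambda^2
```

If either value is far from its target, the inverse CDF was most likely derived or coded incorrectly.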

Rejection Sampling

Applicable situation: the CDF of $P(X)$ is impossible or difficult to compute.

Advantage: unlike MCMC, it doesn't require any “burn-in” period, i.e., all samples obtained during sampling can immediately be used as samples from the target distribution $p(\theta)$.

Based on the Figure above, the method is:

(1) Choose a proposal distribution q(θ) that is easy to sample from.

(2) Find a constant c such that cq(θ) ≥ p(θ) for all θ.

(3) Draw a proposal θ from q(θ).

(4) Draw a u from Uniform[0, cq(θ)].

(5) Reject the proposal if u > p(θ), accept otherwise. Since u is sampled from Uniform[0, cq(θ)], this is equivalent to: reject if $u\in(p(\theta),cq(\theta)]$, accept otherwise.

(6) Repeat steps 3, 4, and 5 until desired number of samples is reached; each accepted sample $\theta$ is a draw from p(θ).

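For example, take the target density p(θ) = 2θ on [0, 1], the proposal q(θ) = Uniform(0, 1), and c = 2, so that cq(θ) = 2 ≥ p(θ) everywhere on [0, 1].

Then the code is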

k=100000;%number of proposals to draw
c=2;%constant such that c*q(theta)>=p(theta) on [0,1]
theta_vec=unifrnd(0,1,k,1);%proposals drawn from q(theta)=Uniform(0,1)
cq_vec=c*unifpdf(theta_vec,0,1);%c*q(theta) for each proposal
p_vec=2*theta_vec;%p(theta)=2*theta for each proposal
u_vec=unifrnd(0,cq_vec);%one u per proposal, u~Uniform[0,c*q(theta)]
r=theta_vec(u_vec<p_vec);%keep only the accepted proposals
hist(r,20);
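Two quantities are handy for checking a rejection sampler. When $p(\theta)$ is normalized, the fraction of accepted proposals is about $1/c$, and the accepted samples should reproduce the moments of $p(\theta)$. The sketch below assumes the same setting as the code above (p(θ) = 2θ on [0, 1], uniform proposal, c = 2), for which the expected acceptance rate is 0.5 and the mean of θ is 2/3. It uses the core function rand so that it runs without a toolbox, and the names acc_rate and sample_mean are illustrative.

```matlab
k = 100000;                          % number of proposals
c = 2;                               % envelope constant, c*q(theta) >= p(theta)
theta = rand(k, 1);                  % proposals from q(theta) = Uniform(0,1)
u = c * rand(k, 1);                  % u ~ Uniform[0, c*q(theta)], with q(theta) = 1
accepted = theta(u < 2 * theta);     % accept when u < p(theta) = 2*theta
acc_rate = numel(accepted) / k;      % expected to be about 1/c = 0.5
sample_mean = mean(accepted);        % expected to be about 2/3
```

A larger c than necessary still gives correct samples but lowers acc_rate, so more proposals are wasted.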

MCMC Sampling

Before turning to MCMC sampling, we first introduce Monte Carlo integration and Markov chains.

  For example:

%Implement the Markov chain x^(t) ~ Beta(200*(0.9*x^(t-1)+0.05), 200*(1-0.9*x^(t-1)-0.05))

fa=@(x) 200*(0.9*x+0.05);%parameter a for beta
fb=@(x) 200*(1-0.9*x-0.05);%parameter b for beta
no4mc=4;%4 Markov chains
states=unifrnd(0,1,1,no4mc);%initial states
N=1000;%1000 transitions for each of the 4 chains
X=states;
for i=1:N
states=betarnd(fa(states),fb(states));%next state given the current one
X=[X;states];
end
plot(X);%one line per chain
pause;
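The plot suggests that the four chains forget their initial states after a short transient. A rough numerical check, as a sketch only: since $a+b=200$, the conditional mean of $x^{(t)}$ is $0.9x^{(t-1)}+0.05$, so the stationary mean $m$ satisfies $m=0.9m+0.05$, i.e. $m=0.5$. The code assumes betarnd is available (Statistics Toolbox in Matlab, statistics package in Octave); burn_in = 200, N = 2000 and the variable names are arbitrary choices for this illustration.

```matlab
% Octave only: load the statistics package that provides betarnd
if exist('OCTAVE_VERSION', 'builtin')
    pkg load statistics
end

fa = @(x) 200*(0.9*x + 0.05);        % parameter a of the Beta transition
fb = @(x) 200*(1 - 0.9*x - 0.05);    % parameter b of the Beta transition
n_chains = 4;                        % number of independent chains
N = 2000;                            % transitions per chain
burn_in = 200;                       % transitions discarded as transient
X = zeros(N + 1, n_chains);          % preallocate: one column per chain
X(1, :) = rand(1, n_chains);         % random initial states in (0,1)
for t = 1:N
    X(t + 1, :) = betarnd(fa(X(t, :)), fb(X(t, :)));
end
kept = X(burn_in + 2:end, :);        % states after the burn-in period
chain_means = mean(kept);            % one mean per chain
overall_mean = mean(kept(:));        % expected to be close to 0.5
```

If all entries of chain_means are close to each other and to 0.5, the chains have reached the same stationary behavior regardless of where they started.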

Metropolis Sampling
