SAS PROC MCMC example in R: Logistic Regression Random-Effects Model(转)
In this post I will run SAS example Logistic Regression Random-Effects Model in four R based solutions; Jags, STAN, MCMCpack and LaplacesDemon. To quote the SAS manual: 'The data are taken from Crowder (1978). The Seeds data set is a 2 x 2 factorial layout, with two types of seeds, O. aegyptiaca 75 and O. aegyptiaca 73, and two root extracts, bean and cucumber. You observe r, which is the number of germinated seeds, and n, which is the total number of seeds. The independent variables are seed and extract.'
The point of this exercise is to demonstrate a random effect. Each observation has a random effect associated with it. In contrast the other parameters have non-informative priors. As such, the models are not complex.
Previous post in the series PROC MCMC examples programmed in R were: example 1: sampling, example 2: Box Cox transformation,example 5: Poisson regression and example 6: Non-Linear Poisson Regression.
JAGS
To make things easy on myself I wondered if this data was already present as R data. That is when I discovered this post at Grimwisdom doing exactly what I wanted as JAGS program. Hence that code was the basis for this JAGS code.
One thing on the models and distributions. JAGS uses tau (1/variance) as hyperparameter for the delta parameters and tau has gamma distribution. In contrast, all other programs use standard deviation as hyperparameter for delta parameter and hence gets the inverse gamma distribution. This distribution is available in the MCMCpack package and also in STAN. Gelman has a publication on this kind of priors:Prior distributions for variance parameters in hierarchical models.
library(R2jags)
data(orob2,package='aod')
seedData <- list ( N = 21,
r = orob2$y,
n = orob2$n,
x1 = c(1,0)[orob2$seed],
x2 = c(0,1)[orob2$root]
) modelRandomEffect <- function() {
for(i in 1:4) {alpha[i] ~ dnorm(0.0,1.0E-6)}
for(i in 1:N) {delta[i] ~ dnorm(0.0,tau)} for (i in 1:N) {
logit(p[i]) <- alpha[1] +
alpha[2]*x1[i] +
alpha[3]*x2[i] +
alpha[4]*x1[i]*x2[i] +
delta[i];
r[i] ~ dbin(p[i],n[i]);
}
tau ~ dgamma(.01,0.01)
s2 <- 1/tau
}
params <- c('alpha','s2')
myj <-jags(model=modelRandomEffect,
data = seedData,
parameters=params)
myj
Inference for Bugs model at "/tmp/RtmpKURJBm/model70173774d0c.txt", fit using jags,
3 chains, each with 2000 iterations (first 1000 discarded)
n.sims = 3000 iterations saved
mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat n.eff
alpha[1] -0.538 0.191 -0.907 -0.664 -0.542 -0.419 -0.135 1.005 440
alpha[2] 0.078 0.300 -0.550 -0.111 0.083 0.269 0.646 1.005 440
alpha[3] 1.342 0.284 0.768 1.164 1.340 1.524 1.895 1.004 1400
alpha[4] -0.807 0.441 -1.663 -1.098 -0.817 -0.521 0.046 1.009 300
s2 0.105 0.096 0.010 0.041 0.077 0.138 0.366 1.051 47
deviance 101.236 6.653 89.853 96.595 100.903 105.313 114.371 1.022 95
For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
DIC info (using the rule, pD = var(deviance)/2)
pD = 21.7 and DIC = 122.9
DIC is an estimate of expected predictive error (lower deviance is better).
MCMCpack
MCMCpack had some difficulty with this particular prior for s2. In the end I chose inverse Gamma(0.1,0.1) that worked. Therefor parameters estimates turn out slightly different. For conciseness only the first parameters are displayed in the output.
library(MCMCpack)
xmat <- cbind(rep(1,21),seedData$x1,seedData$x2,seedData$x1*seedData$x2)
MCMCfun <- function(parm) {
beta=parm[1:4]
s2=parm[5]
delta=parm[5+(1:21)]
if(s2<0 ) return(-Inf)
step1 <- xmat %*% beta + delta
p <- LaplacesDemon::invlogit(step1)
LL <- sum(dbinom(seedData$r,seedData$n,p,log=TRUE))
prior <- sum(dnorm(beta,0,1e3,log=TRUE))+
sum(dnorm(delta,0,sqrt(s2),log=TRUE))+
log(dinvgamma(s2,.1,.1))
LP=LL+prior
return(LP)
}
inits <- c(rnorm(4,0,1),runif(1,0,1),rnorm(21,0,1))
names(inits) <- c(paste('beta',0:3,sep=''),
's2',
paste('delta',1:21,sep=''))
mcmcout <- MCMCmetrop1R(MCMCfun,
inits)
summary(mcmcout)
Iterations = 501:20500
Thinning interval = 1
Number of chains = 1
Sample size per chain = 20000
1. Empirical mean and standard deviation for each variable,
plus standard error of the mean:
Mean SD Naive SE Time-series SE
[1,] -0.59046 0.30456 0.0021535 0.03613
[2,] 0.01476 0.47526 0.0033606 0.04866
[3,] 1.48129 0.38227 0.0027031 0.03883
[4,] -0.93754 0.66414 0.0046962 0.07025
[5,] 0.35146 0.08004 0.0005659 0.05020
STAN
Stan had no problems at all with this model.
library(rstan)
smodel <- '
data {
int <lower=1> N;
int n[N];
int r[N];
real x1[N];
real x2[N];
}
parameters {
real Beta[4];
real <lower=0> s2;
real Delta[N];
}
transformed parameters {
vector[N] mu;
for (i in 1:N) {
mu[i] <- inv_logit(
Beta[1] + Beta[2]*x1[i] +
Beta[3]*x2[i]+Beta[4]*x1[i]*x2[i]+
Delta[i]);
}
}
model {
r ~ binomial(n,mu);
s2 ~ inv_gamma(.01,.01);
Delta ~ normal(0,sqrt(s2));
Beta ~ normal(0,1000);
}
'
fstan <- stan(model_code = smodel,
data = seedData,
pars=c('Beta','s2'))
fstan
Inference for Stan model: smodel.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
Beta[1] -0.56 0.01 0.20 -0.97 -0.68 -0.55 -0.43 -0.14 1370 1.00
Beta[2] 0.08 0.01 0.33 -0.58 -0.12 0.08 0.29 0.72 1385 1.00
Beta[3] 1.36 0.01 0.29 0.83 1.18 1.36 1.54 1.97 1355 1.00
Beta[4] -0.83 0.01 0.46 -1.76 -1.12 -0.82 -0.53 0.04 1434 1.00
s2 0.12 0.00 0.11 0.02 0.06 0.10 0.16 0.42 588 1.01
lp__ -523.70 0.34 6.86 -537.26 -528.19 -523.73 -519.21 -509.87 403 1.01
Samples were drawn using NUTS(diag_e) at Sat Jan 17 18:27:38 2015.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
LaplacesDemon
While in the MCMCpack code I borrowed from LaplacesDemon the invlogit() function, in LaplacesDemon I borrow the InvGamma distribution. I guess it evens out. For a change no further tweaking in the code. Note that the core likelihood function is the same as MCMCpack. However, LaplacesDemon is able to use the correct prior. For brevity again part of the output has not been copied in the blog.
library('LaplacesDemon')
mon.names <- "LP"
parm.names <- c(paste('beta',0:3,sep=''),
's2',
paste('delta',1:21,sep=''))
PGF <- function(Data) {
x <-c(rnorm(21+4+1,0,1))
x[5] <- runif(1,0,2)
x
}
MyData <- list(mon.names=mon.names,
parm.names=parm.names,
PGF=PGF,
xmat = xmat,
r=seedData$r,
n=seedData$n)
N<-1
Model <- function(parm, Data)
{
beta=parm[1:4]
s2=parm[5]
delta=parm[5+(1:21)]
# if(s2<0 ) return(-Inf)
step1 <- xmat %*% beta + delta
p <- invlogit(step1)
LL <- sum(dbinom(seedData$r,seedData$n,p,log=TRUE))
tau <- 1/s2
prior <- sum(dnorm(beta,0,1e3,log=TRUE))+
sum(dnorm(delta,0,sqrt(s2),log=TRUE))+
log(dinvgamma(s2,.01,.01))
LP=LL+prior
Modelout <- list(LP=LP, Dev=-2*LL, Monitor=LP,
yhat=p,
parm=parm)
return(Modelout)
}
Initial.Values <- GIV(Model, MyData, PGF=TRUE)
Fit1 <- LaplacesDemon(Model,
Data=MyData,
Initial.Values = Initial.Values
)
Fit1
Call:
LaplacesDemon(Model = Model, Data = MyData, Initial.Values = Initial.Values)
Acceptance Rate: 0.67594
Algorithm: Metropolis-within-Gibbs
Covariance Matrix: (NOT SHOWN HERE; diagonal shown instead)
beta0 beta1 beta2 beta3 s2 delta1 delta2 delta3
0.218082 0.218082 0.218082 0.218082 0.218082 0.218082 0.218082 0.218082
delta4 delta5 delta6 delta7 delta8 delta9 delta10 delta11
0.218082 0.218082 0.218082 0.218082 0.218082 0.218082 0.218082 0.218082
delta12 delta13 delta14 delta15 delta16 delta17 delta18 delta19
0.218082 0.218082 0.218082 0.218082 0.218082 0.218082 0.218082 0.218082
delta20 delta21
0.218082 0.218082
Covariance (Diagonal) History: (NOT SHOWN HERE)
Deviance Information Criterion (DIC):
All Stationary
Dbar 100.063 100.063
pD 26.630 26.630
DIC 126.692 126.692
Initial Values:
[1] 1.385256609 -0.634946833 1.456635236 -0.041162276 1.883504417
[6] -1.380783003 -0.688367493 0.210060822 0.127231904 0.710367572
[11] -0.865780359 -1.649760777 -0.005532662 -0.114739142 0.642440639
[16] -0.919494616 -0.829018195 -0.938486769 0.302152995 -1.877933490
[21] 1.170542660 0.131282852 0.210852443 0.808779058 -2.115209547
[26] 0.431205368
Iterations: 10000
Log(Marginal Likelihood): -27.92817
Minutes of run-time: 0.81
Model: (NOT SHOWN HERE)
Monitor: (NOT SHOWN HERE)
Parameters (Number of): 26
Posterior1: (NOT SHOWN HERE)
Posterior2: (NOT SHOWN HERE)
Recommended Burn-In of Thinned Samples: 0
Recommended Burn-In of Un-thinned Samples: 0
Recommended Thinning: 240
Specs: (NOT SHOWN HERE)
Status is displayed every 100 iterations
Summary1: (SHOWN BELOW)
Summary2: (SHOWN BELOW)
Thinned Samples: 1000
Thinning: 10
Summary of All Samples
Mean SD MCSE ESS LB
beta0 -0.544001376 0.2305040 0.03535316 93.03421 -0.95266664
beta1 0.060270601 0.3458235 0.04387534 106.92244 -0.67629607
beta2 1.360903746 0.3086684 0.04838620 60.40225 0.76275725
beta3 -0.828532715 0.4864563 0.06323965 99.93646 -1.74744708
s2 0.134846805 0.1055559 0.01599649 68.64912 0.02549966
(...)
Summary of Stationary Samples
Mean SD MCSE ESS LB
beta0 -0.544001376 0.2305040 0.03535316 93.03421 -0.95266664
beta1 0.060270601 0.3458235 0.04387534 106.92244 -0.67629607
beta2 1.360903746 0.3086684 0.04838620 60.40225 0.76275725
beta3 -0.828532715 0.4864563 0.06323965 99.93646 -1.74744708
s2 0.134846805 0.1055559 0.01599649 68.64912 0.02549966
SAS PROC MCMC example in R: Logistic Regression Random-Effects Model(转)的更多相关文章
- SparkMLlib之 logistic regression源码分析
最近在研究机器学习,使用的工具是spark,本文是针对spar最新的源码Spark1.6.0的MLlib中的logistic regression, linear regression进行源码分析,其 ...
- 统计学习方法笔记 Logistic regression
logistic distribution 设X是连续随机变量,X服从逻辑斯谛分布是指X具有下列分布函数和密度函数: 式中,μ为位置参数,γ>0为形状参数. 密度函数是脉冲函数 分布函数是一条S ...
- Logistic Regression and Gradient Descent
Logistic Regression and Gradient Descent Logistic regression is an excellent tool to know for classi ...
- (四)Logistic Regression
1 线性回归 回归就是对已知公式的未知参数进行估计.线性回归就是对于多维空间中的样本点,用特征的线性组合去拟合空间中点的分布和轨迹,比如已知公式是y=a∗x+b,未知参数是a和b,利用多真实的(x,y ...
- 转载 Deep learning:四(logistic regression练习)
前言: 本节来练习下logistic regression相关内容,参考的资料为网页:http://openclassroom.stanford.edu/MainFolder/DocumentPage ...
- More 3D Graphics (rgl) for Classification with Local Logistic Regression and Kernel Density Estimates (from The Elements of Statistical Learning)(转)
This post builds on a previous post, but can be read and understood independently. As part of my cou ...
- Some 3D Graphics (rgl) for Classification with Splines and Logistic Regression (from The Elements of Statistical Learning)(转)
This semester I'm teaching from Hastie, Tibshirani, and Friedman's book, The Elements of Statistical ...
- R︱Softmax Regression建模 (MNIST 手写体识别和文档多分类应用)
本文转载自经管之家论坛, R语言中的Softmax Regression建模 (MNIST 手写体识别和文档多分类应用) R中的softmaxreg包,发自2016-09-09,链接:https:// ...
- Logistic Regression 算法向量化实现及心得
Author: 相忠良(Zhong-Liang Xiang) Email: ugoood@163.com Date: Sep. 23st, 2017 根据 Andrew Ng 老师的深度学习课程课后作 ...
随机推荐
- MySQL一个简单的存储过程demo
使用的工具是Navicat for MySQL. 首先创建一个学生表 mysql) ) ) auto_increment,age ) ) not null,primary key(s_no)); Qu ...
- IO流中的Stream相关对象
流无处不在,只要是关于到文件的输入.输出.更新等,关于IO流,项目中还是经常用到的,写log日志免不了要与其打交道,现在需要用到,就顺道好好回顾一下进行整理,首先是几个需要用到的类的说明,其实说简单点 ...
- Linux-粘滞位的使用
粘滞位(Stickybit),又称粘着位,是Unix文件系统权限的一个旗标.最常见的用法在目录上设置粘滞位, 也只能针对⽬录设置,对于⽂件⽆效.则设置了粘滞位后,只有目录内文件的所有者或者root才可 ...
- python——面向对象基础
概述 面向过程:根据业务逻辑从上到下写垒代码 函数式:将某功能代码封装到函数中,日后便无需重复编写,仅调用函数即可 面向对象:对函数进行分类和封装,让开发“更快更好更强...” 面向过程编程最易被初学 ...
- python——模块
一.导入模块 Python之所以应用越来越广泛,在一定程度上也依赖于其为程序员提供了大量的模块以供使用,如果想要使用模块,则需要导入.导入模块有一下几种方法: 1 import module 2 fr ...
- python——杂货铺
三目运算: >>> 1 if 5>3 else 0 1 >>> 1 if 5<3 else 0 0 深浅拷贝: 一.数字和字符串 对于 数字 和 字符串 ...
- 详解Session分布式共享(.NET CORE版)
一.前言&回顾 在上篇文章Session分布式共享 = Session + Redis + Nginx中,好多同学留言问了我好多问题,其中印象深刻的有:nginx挂了怎么办?采用Redis的S ...
- hadoop集群搭建--CentOS部署Hadoop服务
在了解了Hadoop的相关知识后,接下来就是Hadoop环境的搭建,搭建Hadoop环境是正式学习大数据的开始,接下来就开始搭建环境!我们用到环境为:VMware 12+CentOS6.4 hadoo ...
- seajs模块标识命名和解析规则
模块标识采用路径形式,但要注意与路径的区别.require.require.async的第一个参数是模块标识.而seajs.use第一个参数为文件路径. use是全局的,require是局部的.模块标 ...
- 解决U盘容量变小问题
今天又想重新给电脑刷刷kali linux新版本了貌似N久没更,直接重新刷系统吧...然后发现USB容量变小,这就尴尬了,接着总结了个小方法. 解决方法:1.先把u盘插好,运行cmd,2.输入disk ...