In this post I will run the SAS example Logistic Regression Random-Effects Model in four R-based solutions: JAGS, Stan, MCMCpack and LaplacesDemon. To quote the SAS manual: 'The data are taken from Crowder (1978). The Seeds data set is a 2 x 2 factorial layout, with two types of seeds, O. aegyptiaca 75 and O. aegyptiaca 73, and two root extracts, bean and cucumber. You observe r, which is the number of germinated seeds, and n, which is the total number of seeds. The independent variables are seed and extract.'
The point of this exercise is to demonstrate a random effect: each observation has a random effect associated with it. In contrast, the other parameters have non-informative priors. As such, the models are not complex.
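Written out, the model that all four implementations fit is, for observations i = 1, ..., 21:

r[i] ~ Binomial(n[i], p[i])
logit(p[i]) = alpha[1] + alpha[2]*x1[i] + alpha[3]*x2[i] + alpha[4]*x1[i]*x2[i] + delta[i]
delta[i] ~ N(0, s2)

with vague normal priors on the alpha parameters and a vague prior on the variance s2 (or on the precision tau = 1/s2, see below).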
Previous posts in the series PROC MCMC examples programmed in R were: example 1: sampling, example 2: Box-Cox transformation, example 5: Poisson regression and example 6: Non-Linear Poisson Regression.

JAGS

To make things easy on myself I wondered if this data was already present as R data. That is when I discovered a post at Grimwisdom doing exactly what I wanted as a JAGS program. Hence that code was the basis for the JAGS code below.
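Before fitting, a quick look at the data as they ship with the aod package (a sketch, assuming aod is installed):

data(orob2, package = 'aod')
str(orob2)
# 21 observations; the factors seed and root and the counts n (total seeds)
# and y (germinated seeds) are the pieces used below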
One thing on the models and distributions: JAGS uses tau (1/variance) as the hyperparameter for the delta parameters, and tau gets a gamma prior. In contrast, the other programs parameterize the normal distribution of delta via the variance s2, which hence gets an inverse gamma prior. This distribution is available in the MCMCpack package and also in Stan. Gelman has a publication on this kind of priors: Prior distributions for variance parameters in hierarchical models.
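The two parameterizations are equivalent: if tau ~ gamma(a, b) then s2 = 1/tau follows an inverse gamma(a, b) distribution. A minimal simulation sketch of that relation, using a hypothetical shape 2 and rate 3 rather than the vague 0.01 values, so the quantiles are on a readable scale:

set.seed(2015)
tau <- rgamma(1e5, shape = 2, rate = 3)
s2 <- 1/tau
# quantiles of 1/tau agree with the inverse gamma(2, 3) quantiles,
# obtained here by inverting the corresponding gamma quantiles
cbind(simulated   = quantile(s2, c(.25, .5, .75)),
      theoretical = 1/qgamma(c(.75, .5, .25), shape = 2, rate = 3))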

library(R2jags)
data(orob2, package = 'aod')
seedData <- list(N = 21,
                 r = orob2$y,
                 n = orob2$n,
                 x1 = c(1, 0)[orob2$seed],
                 x2 = c(0, 1)[orob2$root])
modelRandomEffect <- function() {
  # non-informative priors for the fixed effects
  for (i in 1:4) {alpha[i] ~ dnorm(0.0, 1.0E-6)}
  # one random effect per observation, with precision tau
  for (i in 1:N) {delta[i] ~ dnorm(0.0, tau)}
  for (i in 1:N) {
    logit(p[i]) <- alpha[1] +
        alpha[2]*x1[i] +
        alpha[3]*x2[i] +
        alpha[4]*x1[i]*x2[i] +
        delta[i]
    r[i] ~ dbin(p[i], n[i])
  }
  tau ~ dgamma(0.01, 0.01)
  s2 <- 1/tau
}
params <- c('alpha', 's2')
myj <- jags(model = modelRandomEffect,
            data = seedData,
            parameters = params)
myj

Inference for Bugs model at "/tmp/RtmpKURJBm/model70173774d0c.txt", fit using jags,
 3 chains, each with 2000 iterations (first 1000 discarded)
 n.sims = 3000 iterations saved
         mu.vect sd.vect   2.5%    25%     50%     75%   97.5%  Rhat n.eff
alpha[1]  -0.538   0.191 -0.907 -0.664  -0.542  -0.419  -0.135 1.005   440
alpha[2]   0.078   0.300 -0.550 -0.111   0.083   0.269   0.646 1.005   440
alpha[3]   1.342   0.284  0.768  1.164   1.340   1.524   1.895 1.004  1400
alpha[4]  -0.807   0.441 -1.663 -1.098  -0.817  -0.521   0.046 1.009   300
s2         0.105   0.096  0.010  0.041   0.077   0.138   0.366 1.051    47
deviance 101.236   6.653 89.853 96.595 100.903 105.313 114.371 1.022    95

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 21.7 and DIC = 122.9
DIC is an estimate of expected predictive error (lower deviance is better).
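Since the n.eff of 47 for s2 is on the low side, it pays to look beyond the summary table. A minimal sketch, assuming the as.mcmc() method that R2jags registers for its fit objects:

library(coda)
myj.mcmc <- as.mcmc(myj)  # convert the R2jags fit to a coda mcmc.list
plot(myj.mcmc)            # trace and density plots for each saved parameter
effectiveSize(myj.mcmc)   # effective sample sizes, cf. the n.eff column above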

MCMCpack

MCMCpack had some difficulty with this particular prior for s2. In the end I chose an inverse Gamma(0.1, 0.1) prior, which worked. Therefore the parameter estimates turn out slightly different. For conciseness only the first parameters are displayed in the output.

library(MCMCpack)
xmat <- cbind(rep(1, 21), seedData$x1, seedData$x2, seedData$x1*seedData$x2)
MCMCfun <- function(parm) {
  beta <- parm[1:4]
  s2 <- parm[5]
  delta <- parm[5 + (1:21)]
  if (s2 < 0) return(-Inf)  # reject proposals with a negative variance
  step1 <- xmat %*% beta + delta
  p <- LaplacesDemon::invlogit(step1)
  # binomial log-likelihood
  LL <- sum(dbinom(seedData$r, seedData$n, p, log = TRUE))
  prior <- sum(dnorm(beta, 0, 1e3, log = TRUE)) +
      sum(dnorm(delta, 0, sqrt(s2), log = TRUE)) +
      log(dinvgamma(s2, .1, .1))
  LP <- LL + prior
  return(LP)
}
inits <- c(rnorm(4, 0, 1), runif(1, 0, 1), rnorm(21, 0, 1))
names(inits) <- c(paste('beta', 0:3, sep = ''),
                  's2',
                  paste('delta', 1:21, sep = ''))
mcmcout <- MCMCmetrop1R(MCMCfun, inits)
summary(mcmcout)

Iterations = 501:20500
Thinning interval = 1 
Number of chains = 1 
Sample size per chain = 20000

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

Mean      SD  Naive SE Time-series SE
 [1,] -0.59046 0.30456 0.0021535        0.03613
 [2,]  0.01476 0.47526 0.0033606        0.04866
 [3,]  1.48129 0.38227 0.0027031        0.03883
 [4,] -0.93754 0.66414 0.0046962        0.07025
 [5,]  0.35146 0.08004 0.0005659        0.05020
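The time-series standard errors are an order of magnitude above the naive ones, which points at heavy autocorrelation in the chain. Since mcmcout is a coda mcmc object, that is easy to quantify:

library(coda)
effectiveSize(mcmcout)[1:5]
# the effective sample sizes of the first five parameters are far below
# the 20000 draws, matching the large time-series SEs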

STAN

Stan had no problems at all with this model.

library(rstan)
smodel <- '
data {
  int <lower=1> N;
  int n[N];
  int r[N];
  real x1[N];
  real x2[N];
}
parameters {
  real Beta[4];
  real <lower=0> s2;
  real Delta[N];
}
transformed parameters {
  vector[N] mu;
  for (i in 1:N) {
    mu[i] <- inv_logit(
      Beta[1] + Beta[2]*x1[i] +
      Beta[3]*x2[i] + Beta[4]*x1[i]*x2[i] +
      Delta[i]);
  }
}
model {
  r ~ binomial(n, mu);
  s2 ~ inv_gamma(.01, .01);
  Delta ~ normal(0, sqrt(s2));
  Beta ~ normal(0, 1000);
}
'
fstan <- stan(model_code = smodel,
              data = seedData,
              pars = c('Beta', 's2'))
fstan

Inference for Stan model: smodel.
4 chains, each with iter=2000; warmup=1000; thin=1; 
post-warmup draws per chain=1000, total post-warmup draws=4000.

mean se_mean   sd    2.5%     25%     50%     75%   97.5% n_eff Rhat
Beta[1]   -0.56    0.01 0.20   -0.97   -0.68   -0.55   -0.43   -0.14  1370 1.00
Beta[2]    0.08    0.01 0.33   -0.58   -0.12    0.08    0.29    0.72  1385 1.00
Beta[3]    1.36    0.01 0.29    0.83    1.18    1.36    1.54    1.97  1355 1.00
Beta[4]   -0.83    0.01 0.46   -1.76   -1.12   -0.82   -0.53    0.04  1434 1.00
s2         0.12    0.00 0.11    0.02    0.06    0.10    0.16    0.42   588 1.01
lp__    -523.70    0.34 6.86 -537.26 -528.19 -523.73 -519.21 -509.87   403 1.01

Samples were drawn using NUTS(diag_e) at Sat Jan 17 18:27:38 2015.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).
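To work with the draws directly, rstan's extract() returns them as a named list; a short sketch:

sims <- extract(fstan)                # list with elements Beta, s2 and lp__
dim(sims$Beta)                        # posterior draws x 4 fixed effects
quantile(sims$s2, c(.025, .5, .975))  # matches the s2 line in the table above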

LaplacesDemon

While in the MCMCpack code I borrowed the invlogit() function from LaplacesDemon, in the LaplacesDemon code I borrow the inverse gamma density from MCMCpack; I guess it evens out. For a change, no further tweaking of the code was needed. Note that the core likelihood function is the same as in the MCMCpack version, but LaplacesDemon is able to use the intended inverse Gamma(0.01, 0.01) prior. For brevity, again part of the output has not been copied into the blog.

library('LaplacesDemon')
mon.names <- "LP"
parm.names <- c(paste('beta', 0:3, sep = ''),
                's2',
                paste('delta', 1:21, sep = ''))
# parameter-generating function for initial values
PGF <- function(Data) {
  x <- c(rnorm(21 + 4 + 1, 0, 1))
  x[5] <- runif(1, 0, 2)  # start the variance s2 at a positive value
  x
}
MyData <- list(mon.names = mon.names,
               parm.names = parm.names,
               PGF = PGF,
               xmat = xmat,
               r = seedData$r,
               n = seedData$n)
N <- 1
Model <- function(parm, Data) {
  beta <- parm[1:4]
  s2 <- parm[5]
  delta <- parm[5 + (1:21)]
  # if (s2 < 0) return(-Inf)
  step1 <- xmat %*% beta + delta
  p <- invlogit(step1)
  LL <- sum(dbinom(seedData$r, seedData$n, p, log = TRUE))
  tau <- 1/s2
  prior <- sum(dnorm(beta, 0, 1e3, log = TRUE)) +
      sum(dnorm(delta, 0, sqrt(s2), log = TRUE)) +
      log(dinvgamma(s2, .01, .01))
  LP <- LL + prior
  Modelout <- list(LP = LP, Dev = -2*LL, Monitor = LP,
                   yhat = p,
                   parm = parm)
  return(Modelout)
}
Initial.Values <- GIV(Model, MyData, PGF = TRUE)
Fit1 <- LaplacesDemon(Model,
                      Data = MyData,
                      Initial.Values = Initial.Values)
Fit1

Call:
LaplacesDemon(Model = Model, Data = MyData, Initial.Values = Initial.Values)

Acceptance Rate: 0.67594
Algorithm: Metropolis-within-Gibbs
Covariance Matrix: (NOT SHOWN HERE; diagonal shown instead)
   beta0    beta1    beta2    beta3       s2   delta1   delta2   delta3 
0.218082 0.218082 0.218082 0.218082 0.218082 0.218082 0.218082 0.218082 
  delta4   delta5   delta6   delta7   delta8   delta9  delta10  delta11 
0.218082 0.218082 0.218082 0.218082 0.218082 0.218082 0.218082 0.218082 
 delta12  delta13  delta14  delta15  delta16  delta17  delta18  delta19 
0.218082 0.218082 0.218082 0.218082 0.218082 0.218082 0.218082 0.218082 
 delta20  delta21 
0.218082 0.218082

Covariance (Diagonal) History: (NOT SHOWN HERE)
Deviance Information Criterion (DIC):
         All Stationary
Dbar 100.063    100.063
pD    26.630     26.630
DIC  126.692    126.692
Initial Values:
 [1]  1.385256609 -0.634946833  1.456635236 -0.041162276  1.883504417
 [6] -1.380783003 -0.688367493  0.210060822  0.127231904  0.710367572
[11] -0.865780359 -1.649760777 -0.005532662 -0.114739142  0.642440639
[16] -0.919494616 -0.829018195 -0.938486769  0.302152995 -1.877933490
[21]  1.170542660  0.131282852  0.210852443  0.808779058 -2.115209547
[26]  0.431205368

Iterations: 10000
Log(Marginal Likelihood): -27.92817
Minutes of run-time: 0.81
Model: (NOT SHOWN HERE)
Monitor: (NOT SHOWN HERE)
Parameters (Number of): 26
Posterior1: (NOT SHOWN HERE)
Posterior2: (NOT SHOWN HERE)
Recommended Burn-In of Thinned Samples: 0
Recommended Burn-In of Un-thinned Samples: 0
Recommended Thinning: 240
Specs: (NOT SHOWN HERE)
Status is displayed every 100 iterations
Summary1: (SHOWN BELOW)
Summary2: (SHOWN BELOW)
Thinned Samples: 1000
Thinning: 10

Summary of All Samples
                  Mean        SD       MCSE       ESS            LB
beta0     -0.544001376 0.2305040 0.03535316  93.03421   -0.95266664
beta1      0.060270601 0.3458235 0.04387534 106.92244   -0.67629607
beta2      1.360903746 0.3086684 0.04838620  60.40225    0.76275725
beta3     -0.828532715 0.4864563 0.06323965  99.93646   -1.74744708
s2         0.134846805 0.1055559 0.01599649  68.64912    0.02549966
(...)
Summary of Stationary Samples
                  Mean        SD       MCSE       ESS            LB
beta0     -0.544001376 0.2305040 0.03535316  93.03421   -0.95266664
beta1      0.060270601 0.3458235 0.04387534 106.92244   -0.67629607
beta2      1.360903746 0.3086684 0.04838620  60.40225    0.76275725
beta3     -0.828532715 0.4864563 0.06323965  99.93646   -1.74744708
s2         0.134846805 0.1055559 0.01599649  68.64912    0.02549966
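A recommended thinning of 240 signals a strongly autocorrelated chain. LaplacesDemon itself can advise on how to proceed: Consort() prints the fit object together with suggestions for a follow-up run (a sketch; the suggestions depend on the realized chain):

Consort(Fit1)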
