Fitting Bayesian Linear Mixed Models for continuous and binary data using Stan: A quick tutorial
I want to give a quick tutorial on fitting Linear Mixed Models (hierarchical models) with a full variance-covariance matrix for random effects (what Barr et al. 2013 call a maximal model) using Stan.
For a longer version of this tutorial, see Sorensen, Hohenstein, and Vasishth (2016).
Prerequisites: You need to have R installed (RStudio is optional but convenient), and you need to have rstan installed; see here. I am also assuming you have fit lmer models like the one below before:
lmer(log(rt) ~ 1+RCType+dist+int+(1+RCType+dist+int|subj) + (1+RCType+dist+int|item), dat)
If you don't know what the above code means, first read chapter 4 of my lecture notes.
The code and data format needed to fit LMMs in Stan
The data
I assume you have a 2×2 repeated-measures design with some continuous measure like reading time (rt) data, and that you want main-effects and interaction contrast coding. Let's say your main effects are RCType and dist, and the interaction is coded as int. All these contrasts are coded ±1. If you don't know what contrast coding is, see these notes and read section 4.3 (although it's best to read the whole chapter). I am using an excerpt of an example data set from Husain et al. 2014.
"subj" "item" "rt""RCType" "dist" "int"
1 14 438 -1 -1 1
1 16 531 1 -1 -1
1 15 422 1 1 1
1 18 1000 -1 -1 1
...
Assume that these data are stored in R as a data-frame with name rDat.
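If your raw data only contain condition labels rather than the ±1 columns shown above, a minimal sketch for setting up the predictors might look like the following; the factor names rctype and distance are hypothetical and should be replaced by whatever your data actually contain. The last two lines matter later, because the code below uses nlevels() on subj and item.
# Hypothetical example: create ±1 contrast codes from two two-level factors
rDat$RCType <- ifelse(rDat$rctype == "subjRC", -1, 1)
rDat$dist   <- ifelse(rDat$distance == "short", -1, 1)
rDat$int    <- rDat$RCType * rDat$dist  # interaction = product of the main effects
# subj and item must be factors for the nlevels() calls further below
rDat$subj <- factor(rDat$subj)
rDat$item <- factor(rDat$item)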
The Stan code
Copy the following Stan code into a text file and save it as the file matrixModel.stan. For continuous data like reading times or EEG, you never need to touch this file again. You will only ever specify the design matrix X and the structure of the data. The rest is all taken care of.
data {
  int<lower=0> N;                //no trials
  int<lower=1> P;                //no fixefs
  int<lower=0> J;                //no subjects
  int<lower=1> n_u;              //no subj ranefs
  int<lower=0> K;                //no items
  int<lower=1> n_w;              //no item ranefs
  int<lower=1,upper=J> subj[N];  //subject indicator
  int<lower=1,upper=K> item[N];  //item indicator
  row_vector[P] X[N];            //fixef design matrix
  row_vector[n_u] Z_u[N];        //subj ranef design matrix
  row_vector[n_w] Z_w[N];        //item ranef design matrix
  vector[N] rt;                  //reading time
}
parameters {
  vector[P] beta;                //fixef coefs
  cholesky_factor_corr[n_u] L_u; //cholesky factor of subj ranef corr matrix
  cholesky_factor_corr[n_w] L_w; //cholesky factor of item ranef corr matrix
  vector<lower=0>[n_u] sigma_u;  //subj ranef std
  vector<lower=0>[n_w] sigma_w;  //item ranef std
  real<lower=0> sigma_e;         //residual std
  vector[n_u] z_u[J];            //spherical subj ranef
  vector[n_w] z_w[K];            //spherical item ranef
}
transformed parameters {
  vector[n_u] u[J];              //subj ranefs
  vector[n_w] w[K];              //item ranefs
  {
    matrix[n_u,n_u] Sigma_u;     //cholesky factor of subj ranef cov matrix
    matrix[n_w,n_w] Sigma_w;     //cholesky factor of item ranef cov matrix
    Sigma_u = diag_pre_multiply(sigma_u,L_u);
    Sigma_w = diag_pre_multiply(sigma_w,L_w);
    for(j in 1:J)
      u[j] = Sigma_u * z_u[j];
    for(k in 1:K)
      w[k] = Sigma_w * z_w[k];
  }
}
model {
  //priors
  L_u ~ lkj_corr_cholesky(2.0);
  L_w ~ lkj_corr_cholesky(2.0);
  for (j in 1:J)
    z_u[j] ~ normal(0,1);
  for (k in 1:K)
    z_w[k] ~ normal(0,1);
  //likelihood
  for (i in 1:N)
    rt[i] ~ lognormal(X[i] * beta + Z_u[i] * u[subj[i]] + Z_w[i] * w[item[i]], sigma_e);
}
Define the design matrix
Since we want to test the two main effects and their interaction, coded as the columns RCType, dist, and int, we define the design matrix as follows:
# Make design matrix
X <- unname(model.matrix(~ 1 + RCType + dist + int, rDat))
attr(X, "assign") <- NULL
Prepare data for Stan
Unlike lmer, Stan expects the data as a list rather than a data frame, so we set it up as follows:
# Make Stan data
stanDat <- list(N = nrow(X),
                P = ncol(X),
                n_u = ncol(X),
                n_w = ncol(X),
                X = X,
                Z_u = X,
                Z_w = X,
                J = nlevels(rDat$subj),
                K = nlevels(rDat$item),
                rt = rDat$rt,
                subj = as.integer(rDat$subj),
                item = as.integer(rDat$item))
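A few quick checks before fitting can save a lot of debugging time; this sketch assumes subj and item are factors, so that as.integer() yields indices running from 1 to J and 1 to K:
# Sanity checks on the list passed to Stan
stopifnot(is.factor(rDat$subj), is.factor(rDat$item))
stopifnot(min(stanDat$subj) == 1, max(stanDat$subj) == stanDat$J)
stopifnot(min(stanDat$item) == 1, max(stanDat$item) == stanDat$K)
str(stanDat)  # inspect what will be handed to Stan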
Load library rstan and fit Stan model
library(rstan)
rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())
# Fit the model
matrixFit <- stan(file = "matrixModel.stan", data = stanDat,
                  iter = 2000, chains = 4)
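Before interpreting anything, check that the chains converged. A minimal sketch using rstan's built-in diagnostics:
# Rhat should be close to 1 for all parameters
summary(matrixFit)$summary[, "Rhat"]
# Visual check of mixing for the fixed effects
traceplot(matrixFit, pars = "beta")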
Examine posteriors
print(matrixFit)
This print output is overly verbose. I wrote a simple function to get the essential information quickly.
stan_results <- function(m, params){
  m_extr <- extract(m, pars = params)
  means <- lapply(m_extr, mean)
  quantiles <- lapply(m_extr,
                      function(x) quantile(x, probs = c(0.025, 0.975)))
  means <- data.frame(means)
  quants <- data.frame(quantiles)
  summry <- t(rbind(means, quants))
  colnames(summry) <- c("mean", "lower", "upper")
  summry
}
For example, if I want to see only the posteriors of the four beta parameters, I can write:
stan_results(matrixFit, params=c("beta[1]","beta[2]","beta[3]","beta[4]"))
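If you also want a single-number summary of the evidence for an effect, one simple option (not the only one; see the paper linked below for fuller treatments) is the posterior probability that a coefficient is positive. Here beta[2] is the RCType effect, given the design matrix defined above:
# Posterior samples of the fixed effects: an iterations-by-4 matrix
beta_samples <- extract(matrixFit, pars = "beta")$beta
# Posterior probability that the RCType coefficient is greater than zero
mean(beta_samples[, 2] > 0)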
For more details, such as interpreting the results and computing things like Bayes factors, see Nicenboim and Vasishth 2016.
FAQ: What if I don't want to fit a lognormal?
In the Stan code above, I assume a lognormal function for the reading times:
rt[i] ~ lognormal(X[i] * beta + Z_u[i] * u[subj[i]] + Z_w[i] * w[item[i]], sigma_e);
If this upsets you deeply and you want to use a normal distribution (and in fact, for EEG data this makes sense), go right ahead and change the lognormal to normal:
rt[i] ~ normal(X[i] * beta + Z_u[i] * u[subj[i]] + Z_w[i] * w[item[i]], sigma_e);
FAQ: What if my dependent measure is a binary (0,1) response?
Use the following Stan code instead of the one shown above. Here, I assume that you have a column called response in the data, containing the trial-level binary (0,1) responses.
data {
  int<lower=0> N;                //no trials
  int<lower=1> P;                //no fixefs
  int<lower=0> J;                //no subjects
  int<lower=1> n_u;              //no subj ranefs
  int<lower=0> K;                //no items
  int<lower=1> n_w;              //no item ranefs
  int<lower=1,upper=J> subj[N];  //subject indicator
  int<lower=1,upper=K> item[N];  //item indicator
  row_vector[P] X[N];            //fixef design matrix
  row_vector[n_u] Z_u[N];        //subj ranef design matrix
  row_vector[n_w] Z_w[N];        //item ranef design matrix
  int response[N];               //response
}
parameters {
  vector[P] beta;                //fixef coefs
  cholesky_factor_corr[n_u] L_u; //cholesky factor of subj ranef corr matrix
  cholesky_factor_corr[n_w] L_w; //cholesky factor of item ranef corr matrix
  vector<lower=0>[n_u] sigma_u;  //subj ranef std
  vector<lower=0>[n_w] sigma_w;  //item ranef std
  vector[n_u] z_u[J];            //spherical subj ranef
  vector[n_w] z_w[K];            //spherical item ranef
}
transformed parameters {
  vector[n_u] u[J];              //subj ranefs
  vector[n_w] w[K];              //item ranefs
  {
    matrix[n_u,n_u] Sigma_u;     //cholesky factor of subj ranef cov matrix
    matrix[n_w,n_w] Sigma_w;     //cholesky factor of item ranef cov matrix
    Sigma_u = diag_pre_multiply(sigma_u,L_u);
    Sigma_w = diag_pre_multiply(sigma_w,L_w);
    for(j in 1:J)
      u[j] = Sigma_u * z_u[j];
    for(k in 1:K)
      w[k] = Sigma_w * z_w[k];
  }
}
model {
  //priors
  beta ~ cauchy(0,2.5);
  sigma_u ~ cauchy(0,2.5);
  sigma_w ~ cauchy(0,2.5);
  L_u ~ lkj_corr_cholesky(2.0);
  L_w ~ lkj_corr_cholesky(2.0);
  for (j in 1:J)
    z_u[j] ~ normal(0,1);
  for (k in 1:K)
    z_w[k] ~ normal(0,1);
  //likelihood
  for (i in 1:N)
    response[i] ~ bernoulli_logit(X[i] * beta + Z_u[i] * u[subj[i]] + Z_w[i] * w[item[i]]);
}
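On the R side very little changes: save the code above under a new name (here I use the hypothetical file name matrixModelBinary.stan) and pass response instead of rt in the data list. A sketch:
# Same data list as before, but with the binary response instead of rt
stanDatB <- stanDat
stanDatB$rt <- NULL
stanDatB$response <- as.integer(rDat$response)
binaryFit <- stan(file = "matrixModelBinary.stan", data = stanDatB,
                  iter = 2000, chains = 4)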
For reproducible example code
See here.