Fitting Bayesian Linear Mixed Models for continuous and binary data using Stan: A quick tutorial
I want to give a quick tutorial on fitting Linear Mixed Models (hierarchical models) with a full variance-covariance matrix for random effects (what Barr et al 2013 call a maximal model) using Stan.
For a longer version of this tutorial, see: Sorensen, Hohenstein, Vasishth, 2016.
Prerequisites: You need to have R and preferably RStudio installed; RStudio is optional. You need to have rstan installed. See here. I am also assuming you have fit lmer models like these before:
lmer(log(rt) ~ 1+RCType+dist+int+(1+RCType+dist+int|subj) + (1+RCType+dist+int|item), dat)
If you don't know what the above code means, first read chapter 4 of my lecture notes.
The code and data format needed to fit LMMs in Stan
The data
I assume you have a 2x2 repeated measures design with some continuous measure like reading time (rt) data and want to do a main effects and interaction contrast coding. Let's say your main effects are RCType and dist, and the interaction is coded as int. All these contrast codings are ±1. If you don't know what contrast coding is, see these notes and read section 4.3 (although it's best to read the whole chapter). I am using an excerpt of an example data-set from Husain et al. 2014.
"subj" "item" "rt""RCType" "dist" "int"
1 14 438 -1 -1 1
1 16 531 1 -1 -1
1 15 422 1 1 1
1 18 1000 -1 -1 1
...
Assume that these data are stored in R as a data-frame with name rDat.
The Stan code
Copy the following Stan code into a text file and save it as the file matrixModel.stan. For continuous data like reading times or EEG, you never need to touch this file again. You will only ever specify the design matrix X and the structure of the data. The rest is all taken care of.
data {
int<lower=0> N; //no trials
int<lower=1> P; //no fixefs
int<lower=0> J; //no subjects
int<lower=1> n_u; //no subj ranefs
int<lower=0> K; //no items
int<lower=1> n_w; //no item ranefs
int<lower=1,upper=j> subj[N]; //subject indicator
int<lower=1,upper=k> item[N]; //item indicator
row_vector[P] X[N]; //fixef design matrix
row_vector[n_u] Z_u[N]; //subj ranef design matrix
row_vector[n_w] Z_w[N]; //item ranef design matrix
vector[N] rt; //reading time
}
parameters {
vector[P] beta; //fixef coefs
cholesky_factor_corr[n_u] L_u; //cholesky factor of subj ranef corr matrix
cholesky_factor_corr[n_w] L_w; //cholesky factor of item ranef corr matrix
vector<lower=0>[n_u] sigma_u; //subj ranef std
vector<lower=0>[n_w] sigma_w; //item ranef std
real<lower=0> sigma_e; //residual std
vector[n_u] z_u[J]; //spherical subj ranef
vector[n_w] z_w[K]; //spherical item ranef
}
transformed parameters {
vector[n_u] u[J]; //subj ranefs
vector[n_w] w[K]; //item ranefs
{
matrix[n_u,n_u] Sigma_u; //subj ranef cov matrix
matrix[n_w,n_w] Sigma_w; //item ranef cov matrix
Sigma_u = diag_pre_multiply(sigma_u,L_u);
Sigma_w = diag_pre_multiply(sigma_w,L_w);
for(j in 1:J)
u[j] = Sigma_u * z_u[j];
for(k in 1:K)
w[k] = Sigma_w * z_w[k];
}
}
model {
//priors
L_u ~ lkj_corr_cholesky(2.0);
L_w ~ lkj_corr_cholesky(2.0);
for (j in 1:J)
z_u[j] ~ normal(0,1);
for (k in 1:K)
z_w[k] ~ normal(0,1);
//likelihood
for (i in 1:N)
rt[i] ~ lognormal(X[i] * beta + Z_u[i] * u[subj[i]] + Z_w[i] * w[item[i]], sigma_e);
}
Define the design matrix
Since we want to test the main effects coded as the columns RCType, dist, and int, our design matrix will look like this:
# Make design matrix
X <- unname(model.matrix(~ 1 + RCType + dist + int, rDat))
attr(X, "assign") <- NULL
Prepare data for Stan
Stan expects the data in a list form, not as a data frame (unlike lmer). So we set it up as follows:
# Make Stan data
stanDat <- list(N = nrow(X),
P = ncol(X),
n_u = ncol(X),
n_w = ncol(X),
X = X,
Z_u = X,
Z_w = X,
J = nlevels(rDat$subj),
K = nlevels(rDat$item),
rt = rDat$rt,
subj = as.integer(rDat$subj),
item = as.integer(rDat$item))
Load library rstan and fit Stan model
library(rstan)
rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores()) # Fit the model
matrixFit <- stan(file = "matrixModel.stan", data = stanDat,
iter = 2000, chains = 4)
Examine posteriors
print(matrixFit)
This print output is overly verbose. I wrote a simple function to get the essential information quickly.
stan_results<-function(m,params=paramnames){
m_extr<-extract(m,pars=paramnames)
par_names<-names(m_extr)
means<-lapply(m_extr,mean)
quantiles<-lapply(m_extr,
function(x)quantile(x,probs=c(0.025,0.975)))
means<-data.frame(means)
quants<-data.frame(quantiles)
summry<-t(rbind(means,quants))
colnames(summry)<-c("mean","lower","upper")
summry
}
For example, if I want to see only the posteriors of the four beta parameters, I can write:
stan_results(matrixFit, params=c("beta[1]","beta[2]","beta[3]","beta[4]"))
For more details, such as interpreting the results and computing things like Bayes Factors, seeNicenboim and Vasishth 2016.
FAQ: What if I don't want to fit a lognormal?
In the Stan code above, I assume a lognormal function for the reading times:
rt[i] ~ lognormal(X[i] * beta + Z_u[i] * u[subj[i]] + Z_w[i] * w[item[i]], sigma_e);
If this upsets you deeply and you want to use a normal distribution (and in fact, for EEG data this makes sense), go right ahead and change the lognormal to normal:
rt[i] ~ normal(X[i] * beta + Z_u[i] * u[subj[i]] + Z_w[i] * w[item[i]], sigma_e);
FAQ: What if I my dependent measure is binary (0,1) responses?
Use this Stan code instead of the one shown above. Here, I assume that you have a column called response in the data, which has 0,1 values. These are the trial level binary responses.
data {
int<lower=0> N; //no trials
int<lower=1> P; //no fixefs
int<lower=0> J; //no subjects
int<lower=1> n_u; //no subj ranefs
int<lower=0> K; //no items
int<lower=1> n_w; //no item ranefs
int<lower=1,upper=j> subj[N]; //subject indicator
int<lower=1,upper=k> item[N]; //item indicator
row_vector[P] X[N]; //fixef design matrix
row_vector[n_u] Z_u[N]; //subj ranef design matrix
row_vector[n_w] Z_w[N]; //item ranef design matrix
int response[N]; //response
}
parameters {
vector[P] beta; //fixef coefs
cholesky_factor_corr[n_u] L_u; //cholesky factor of subj ranef corr matrix
cholesky_factor_corr[n_w] L_w; //cholesky factor of item ranef corr matrix
vector<lower=0>[n_u] sigma_u; //subj ranef std
vector<lower=0>[n_w] sigma_w; //item ranef std
vector[n_u] z_u[J]; //spherical subj ranef
vector[n_w] z_w[K]; //spherical item ranef
}
transformed parameters {
vector[n_u] u[J]; //subj ranefs
vector[n_w] w[K]; //item ranefs
{
matrix[n_u,n_u] Sigma_u; //subj ranef cov matrix
matrix[n_w,n_w] Sigma_w; //item ranef cov matrix
Sigma_u = diag_pre_multiply(sigma_u,L_u);
Sigma_w = diag_pre_multiply(sigma_w,L_w);
for(j in 1:J)
u[j] = Sigma_u * z_u[j];
for(k in 1:K)
w[k] = Sigma_w * z_w[k];
}
}
model {
//priors
beta ~ cauchy(0,2.5);
sigma_u ~ cauchy(0,2.5);
sigma_w ~ cauchy(0,2.5);
L_u ~ lkj_corr_cholesky(2.0);
L_w ~ lkj_corr_cholesky(2.0);
for (j in 1:J)
z_u[j] ~ normal(0,1);
for (k in 1:K)
z_w[k] ~ normal(0,1);
//likelihood
for (i in 1:N)
response[i] ~ bernoulli_logit(X[i] * beta + Z_u[i] * u[subj[i]] + Z_w[i] * w[item[i]]);
}
For reproducible example code
See here.
Fitting Bayesian Linear Mixed Models for continuous and binary data using Stan: A quick tutorial的更多相关文章
- 混合线性模型(linear mixed models)
一般线性模型.混合线性模型.广义线性模型 广义线性模型GLM很简单,举个例子,药物的疗效和服用药物的剂量有关.这个相关性可能是多种多样的,可能是简单线性关系(发烧时吃一片药退烧0.1度,两片药退烧0. ...
- [Sklearn] Linear regression models to fit noisy data
Ref: [Link] sklearn各种回归和预测[各线性模型对噪声的反应] Ref: Linear Regression 实战[循序渐进思考过程] Ref: simple linear regre ...
- [ML] Bayesian Linear Regression
热身预览 1.1.10. Bayesian Regression 1.1.10.1. Bayesian Ridge Regression 1.1.10.2. Automatic Relevance D ...
- 贝叶斯线性回归(Bayesian Linear Regression)
贝叶斯线性回归(Bayesian Linear Regression) 2016年06月21日 09:50:40 Duanxx 阅读数 54254更多 分类专栏: 监督学习 版权声明:本文为博主原 ...
- [Bayesian] “我是bayesian我怕谁”系列 - Continuous Latent Variables
打开prml and mlapp发现这部分目录编排有点小不同,但神奇的是章节序号竟然都为“十二”. prml:pca --> ppca --> fa mlapp:fa --> pca ...
- 机器学习理论基础学习17---贝叶斯线性回归(Bayesian Linear Regression)
本文顺序 一.回忆线性回归 线性回归用最小二乘法,转换为极大似然估计求解参数W,但这很容易导致过拟合,由此引入了带正则化的最小二乘法(可证明等价于最大后验概率) 二.什么是贝叶斯回归? 基于上面的讨论 ...
- Popular generalized linear models|GLMM| Zero-truncated Models|Zero-Inflated Models|matched case–control studies|多重logistics回归|ordered logistics regression
============================================================== Popular generalized linear models 将不同 ...
- 最大似然估计实例 | Fitting a Model by Maximum Likelihood (MLE)
参考:Fitting a Model by Maximum Likelihood 最大似然估计是用于估计模型参数的,首先我们必须选定一个模型,然后比对有给定的数据集,然后构建一个联合概率函数,因为给定 ...
- KDD2016,Accepted Papers
RESEARCH TRACK PAPERS - ORAL Title & Authors NetCycle: Collective Evolution Inference in Heterog ...
随机推荐
- javascript中的几种遍历方法浅析
1. for...in 用于对数组或者对象的属性的可枚举属性进行循环操作.注意该对象来自原型链上的可枚举属性也会被循环.下面看例子 var arr = ["lee","h ...
- PAT 1046
1046. Shortest Distance (20) The task is really simple: given N exits on a highway which forms a sim ...
- 你的计算机也可以看懂世界——十分钟跑起卷积神经网络(Windows+CPU)
众所周知,如果你想研究Deep Learning,那么比较常用的配置是Linux+GPU,不过现在很多非计算机专业的同学有时也会想采用Deep Learning方法来完成一些工作,那么Linux+GP ...
- yii2.0套用模板问题
载入视图 在控制器中: $this->render(); 会加载布局 $this->renderPartial(); 不会加载布局(也不能载入框架自带的jquery等) Yii2 选择布局 ...
- DHTMLX 修改方法加参数
dhtmlx下拉框选项过长,导致显示不全,所以在下拉框里加了title 具体方法如下: dhtmlXCombo.prototype.modes.checkbox.render=function(c, ...
- 获取camera截屏图片
Camera camera; SpriteRenderer sprRender; Texture2D t2d = New Texture2D(1300, 760, TextureFormat.RGB2 ...
- IOS语音录取
在IOS中,在做语音识别中,需要对语音进行抓取. #import "GetAudioViewController.h" #import <AVFoundation/AVFou ...
- tomcat的环境搭建
tomcat搭建过程还是比较简单的,只需要安装好jdk,然后配置好环境变量,最后把tomcat安装上开启就可以了. 首先下载jdk,然后把下载下来的jdk放到/usr/local下,然后用rpm -i ...
- CSS3弹性伸缩布局(下)——flex布局
新版本 新版本的flex布局模型是2012年9月提出的工作草案,这个草案是由W3C推出的最新语法,这个版本立志于指定标准,让新式的浏览器全面兼容,在未来的浏览器更新换代中实现统一. 目前几乎大部分的浏 ...
- PHP 序列化与反序列化函数
序列化与反序列化 把复杂的数据类型压缩到一个字符串中 serialize() 把变量和它们的值编码成文本形式 unserialize() 恢复原先变量 1.创建一个$arr数组用于储存用户基本信息 ...