This is a quick tutorial on fitting Linear Mixed Models (hierarchical models) with a full variance-covariance matrix for the random effects (what Barr et al. 2013 call a maximal model) using Stan.

For a longer version of this tutorial, see: Sorensen, Hohenstein, Vasishth, 2016.

Prerequisites: You need to have R installed; RStudio is recommended but optional. You also need to have rstan installed. See here. I am also assuming you have fit lmer models like these before:

lmer(log(rt) ~ 1+RCType+dist+int+(1+RCType+dist+int|subj) + (1+RCType+dist+int|item), dat)

If you don't know what the above code means, first read chapter 4 of my lecture notes.

The code and data format needed to fit LMMs in Stan

The data

I assume you have a 2x2 repeated measures design with a continuous dependent measure such as reading time (rt), and that you want to code the two main effects and their interaction as contrasts. Let's say the main effects are RCType and dist, and the interaction is coded as int (the product of RCType and dist). All these contrasts are coded ±1. If you don't know what contrast coding is, see these notes and read section 4.3 (although it's best to read the whole chapter). I am using an excerpt of an example data-set from Husain et al. 2014.
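If your conditions are stored as labels rather than as ±1 columns, the coding can be derived in a few lines. This is only a minimal sketch: the label columns `rctype_label` and `dist_label` and their level names are hypothetical and are not taken from the Husain et al. data.

```r
# Hypothetical condition labels; the level names are illustrative only.
conds <- data.frame(
  rctype_label = c("levelA", "levelB", "levelB", "levelA"),
  dist_label   = c("short", "short", "long", "short")
)

# +1/-1 contrast coding for the two main effects.
conds$RCType <- ifelse(conds$rctype_label == "levelB", 1, -1)
conds$dist   <- ifelse(conds$dist_label == "long", 1, -1)

# The interaction contrast is the product of the two main-effect contrasts.
conds$int <- conds$RCType * conds$dist

conds[, c("RCType", "dist", "int")]
```

Which level receives +1 is a choice; it only flips the sign of the corresponding coefficient.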

"subj" "item" "rt" "RCType" "dist" "int"
1 14 438 -1 -1 1
1 16 531 1 -1 -1
1 15 422 1 1 1
1 18 1000 -1 -1 1
...

Assume that these data are stored in R as a data-frame with name rDat.
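The data-preparation code further down calls nlevels() on rDat$subj and rDat$item, which only works if these columns are factors. A minimal sketch of one way to get there, using just the four example rows shown above (reading the full data set from a file would work the same way):

```r
# Read the four example rows into a data frame.
rDat <- read.table(header = TRUE, text = "
subj item rt RCType dist int
1 14 438 -1 -1 1
1 16 531 1 -1 -1
1 15 422 1 1 1
1 18 1000 -1 -1 1
")

# Convert the identifiers to factors so that nlevels() counts them and
# as.integer() maps them onto consecutive indices 1, 2, 3, ...
rDat$subj <- factor(rDat$subj)
rDat$item <- factor(rDat$item)

nlevels(rDat$item)      # number of distinct items in this excerpt
as.integer(rDat$item)   # consecutive item indices, as Stan expects
```

The original item numbers (14, 16, 15, 18) are not consecutive, which is why the integer codes of the factor are passed to Stan rather than the raw identifiers.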

The Stan code

Copy the following Stan code into a text file and save it as the file matrixModel.stan. For continuous data like reading times or EEG, you never need to touch this file again. You will only ever specify the design matrix X and the structure of the data. The rest is all taken care of.

data {
int<lower=0> N; //no trials
int<lower=1> P; //no fixefs
int<lower=0> J; //no subjects
int<lower=1> n_u; //no subj ranefs
int<lower=0> K; //no items
int<lower=1> n_w; //no item ranefs
int<lower=1,upper=J> subj[N]; //subject indicator
int<lower=1,upper=K> item[N]; //item indicator
row_vector[P] X[N]; //fixef design matrix
row_vector[n_u] Z_u[N]; //subj ranef design matrix
row_vector[n_w] Z_w[N]; //item ranef design matrix
vector[N] rt; //reading time
} parameters {
vector[P] beta; //fixef coefs
cholesky_factor_corr[n_u] L_u; //cholesky factor of subj ranef corr matrix
cholesky_factor_corr[n_w] L_w; //cholesky factor of item ranef corr matrix
vector<lower=0>[n_u] sigma_u; //subj ranef std
vector<lower=0>[n_w] sigma_w; //item ranef std
real<lower=0> sigma_e; //residual std
vector[n_u] z_u[J]; //spherical subj ranef
vector[n_w] z_w[K]; //spherical item ranef
} transformed parameters {
vector[n_u] u[J]; //subj ranefs
vector[n_w] w[K]; //item ranefs
{
matrix[n_u,n_u] Sigma_u; //cholesky factor of subj ranef cov matrix
matrix[n_w,n_w] Sigma_w; //cholesky factor of item ranef cov matrix
Sigma_u = diag_pre_multiply(sigma_u,L_u);
Sigma_w = diag_pre_multiply(sigma_w,L_w);
for(j in 1:J)
u[j] = Sigma_u * z_u[j];
for(k in 1:K)
w[k] = Sigma_w * z_w[k];
}
} model {
//priors
L_u ~ lkj_corr_cholesky(2.0);
L_w ~ lkj_corr_cholesky(2.0);
for (j in 1:J)
z_u[j] ~ normal(0,1);
for (k in 1:K)
z_w[k] ~ normal(0,1);
//likelihood
for (i in 1:N)
rt[i] ~ lognormal(X[i] * beta + Z_u[i] * u[subj[i]] + Z_w[i] * w[item[i]], sigma_e);
}
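In the transformed parameters block, the random effects are built from standard-normal ("spherical") draws: the matrix that the Stan code calls Sigma_u is the product diag(sigma_u) * L_u, and u[j] is that matrix times z_u[j]. The following R sketch checks numerically that this construction yields the intended covariance matrix. The standard deviations and the correlation are made-up values, and the code only illustrates the algebra; it is not part of the model.

```r
set.seed(1)

# Made-up values for illustration: two random effects per subject.
sigma_u <- c(0.5, 0.2)                       # standard deviations
Omega   <- matrix(c(1, 0.3, 0.3, 1), 2, 2)   # correlation matrix

# Lower-triangular Cholesky factor, so that L_u %*% t(L_u) equals Omega.
L_u <- t(chol(Omega))

# R equivalent of Stan's diag_pre_multiply(sigma_u, L_u).
A <- diag(sigma_u) %*% L_u

# Implied covariance of u = A z when z is standard normal.
Sigma_implied <- A %*% t(A)
Sigma_target  <- diag(sigma_u) %*% Omega %*% diag(sigma_u)
all.equal(Sigma_implied, Sigma_target)

# Simulation check: transform many spherical draws and inspect their covariance.
z <- matrix(rnorm(2 * 1e5), nrow = 2)
u <- A %*% z
round(cov(t(u)), 3)
```

Sampling z_u and z_w on the standard-normal scale and transforming them afterwards is often called a non-centered parameterization.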

Define the design matrix

Since we want to estimate the two main effects and their interaction, coded as the columns RCType, dist, and int, our design matrix will look like this:

# Make design matrix
X <- unname(model.matrix(~ 1 + RCType + dist + int, rDat))
attr(X, "assign") <- NULL

Prepare data for Stan

Stan expects the data in a list form, not as a data frame (unlike lmer). So we set it up as follows:

# Make Stan data
stanDat <- list(N = nrow(X),
P = ncol(X),
n_u = ncol(X),
n_w = ncol(X),
X = X,
Z_u = X,
Z_w = X,
J = nlevels(rDat$subj),
K = nlevels(rDat$item),
rt = rDat$rt,
subj = as.integer(rDat$subj),
item = as.integer(rDat$item))
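Stan rejects data that violate the constraints declared in the data block, so it can save time to check the list in R before sampling. Here is a minimal sketch of such a check; it uses a small made-up data frame in place of the real rDat, and the particular checks are my suggestion rather than anything rstan requires.

```r
# Small made-up data set standing in for rDat (values are illustrative).
rDat <- data.frame(
  subj   = factor(c(1, 1, 2, 2)),
  item   = factor(c(14, 16, 14, 16)),
  rt     = c(438, 531, 422, 1000),
  RCType = c(-1, 1, 1, -1),
  dist   = c(-1, -1, 1, 1)
)
rDat$int <- rDat$RCType * rDat$dist

X <- unname(model.matrix(~ 1 + RCType + dist + int, rDat))
attr(X, "assign") <- NULL

stanDat <- list(N = nrow(X), P = ncol(X), n_u = ncol(X), n_w = ncol(X),
                X = X, Z_u = X, Z_w = X,
                J = nlevels(rDat$subj), K = nlevels(rDat$item),
                rt = rDat$rt,
                subj = as.integer(rDat$subj),
                item = as.integer(rDat$item))

# Checks that mirror the declarations in the Stan data block.
stopifnot(
  length(stanDat$rt) == stanDat$N,
  all(stanDat$subj >= 1 & stanDat$subj <= stanDat$J),
  all(stanDat$item >= 1 & stanDat$item <= stanDat$K),
  all(dim(stanDat$X) == c(stanDat$N, stanDat$P)),
  all(stanDat$rt > 0)   # the lognormal likelihood needs positive values
)
```

If subj or item is not a factor, nlevels() returns 0 and the index checks fail, which points to the problem before Stan is ever called.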

Load library rstan and fit Stan model

library(rstan)
rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())
# Fit the model
matrixFit <- stan(file = "matrixModel.stan", data = stanDat,
iter = 2000, chains = 4)

Examine posteriors

print(matrixFit)

This print output is overly verbose. I wrote a simple function to get the essential information quickly.

stan_results<-function(m,params){
# extract the posterior samples for the requested parameters only
m_extr<-extract(m,pars=params)
par_names<-names(m_extr)
means<-lapply(m_extr,mean)
quantiles<-lapply(m_extr,
function(x)quantile(x,probs=c(0.025,0.975)))
means<-data.frame(means)
quants<-data.frame(quantiles)
summry<-t(rbind(means,quants))
colnames(summry)<-c("mean","lower","upper")
summry
}

For example, if I want to see only the posteriors of the four beta parameters, I can write:

stan_results(matrixFit, params=c("beta[1]","beta[2]","beta[3]","beta[4]"))

For more details, such as interpreting the results and computing things like Bayes Factors, see Nicenboim and Vasishth 2016.

FAQ: What if I don't want to fit a lognormal?

In the Stan code above, I assume a lognormal likelihood for the reading times:

 rt[i] ~ lognormal(X[i] * beta + Z_u[i] * u[subj[i]] + Z_w[i] * w[item[i]], sigma_e);

If this upsets you deeply and you want to use a normal distribution (and in fact, for EEG data this makes sense), go right ahead and change the lognormal to normal:

 rt[i] ~ normal(X[i] * beta + Z_u[i] * u[subj[i]] + Z_w[i] * w[item[i]], sigma_e);
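The choice affects how the coefficients are read: with the lognormal likelihood, beta is on the log scale, which corresponds to the log(rt) in the lmer call shown earlier, whereas with the normal likelihood it is on the raw scale of the data. The following R sketch, with made-up values for the location and scale, shows the relationship between the two densities.

```r
# Made-up values for illustration.
rt    <- c(438, 531, 422, 1000)
mu    <- 6.2
sigma <- 0.4

# Lognormal density of rt versus normal density of log(rt) multiplied by
# the Jacobian term 1/rt; the two agree.
all.equal(dlnorm(rt, meanlog = mu, sdlog = sigma),
          dnorm(log(rt), mean = mu, sd = sigma) / rt)

# On the millisecond scale, exp(mu) is the median of the lognormal.
exp(mu)
qlnorm(0.5, meanlog = mu, sdlog = sigma)
```

Because the Jacobian term does not involve the parameters, a lognormal likelihood on rt and a normal likelihood on log(rt) lead to the same posterior.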

FAQ: What if my dependent measure consists of binary (0,1) responses?

Use this Stan code instead of the one shown above. Here, I assume that you have a column called response in the data, which has 0,1 values. These are the trial level binary responses.

data {
int<lower=0> N; //no trials
int<lower=1> P; //no fixefs
int<lower=0> J; //no subjects
int<lower=1> n_u; //no subj ranefs
int<lower=0> K; //no items
int<lower=1> n_w; //no item ranefs
int<lower=1,upper=J> subj[N]; //subject indicator
int<lower=1,upper=K> item[N]; //item indicator
row_vector[P] X[N]; //fixef design matrix
row_vector[n_u] Z_u[N]; //subj ranef design matrix
row_vector[n_w] Z_w[N]; //item ranef design matrix
int response[N]; //response
} parameters {
vector[P] beta; //fixef coefs
cholesky_factor_corr[n_u] L_u; //cholesky factor of subj ranef corr matrix
cholesky_factor_corr[n_w] L_w; //cholesky factor of item ranef corr matrix
vector<lower=0>[n_u] sigma_u; //subj ranef std
vector<lower=0>[n_w] sigma_w; //item ranef std
vector[n_u] z_u[J]; //spherical subj ranef
vector[n_w] z_w[K]; //spherical item ranef
} transformed parameters {
vector[n_u] u[J]; //subj ranefs
vector[n_w] w[K]; //item ranefs
{
matrix[n_u,n_u] Sigma_u; //cholesky factor of subj ranef cov matrix
matrix[n_w,n_w] Sigma_w; //cholesky factor of item ranef cov matrix
Sigma_u = diag_pre_multiply(sigma_u,L_u);
Sigma_w = diag_pre_multiply(sigma_w,L_w);
for(j in 1:J)
u[j] = Sigma_u * z_u[j];
for(k in 1:K)
w[k] = Sigma_w * z_w[k];
}
} model {
//priors
beta ~ cauchy(0,2.5);
sigma_u ~ cauchy(0,2.5);
sigma_w ~ cauchy(0,2.5);
L_u ~ lkj_corr_cholesky(2.0);
L_w ~ lkj_corr_cholesky(2.0);
for (j in 1:J)
z_u[j] ~ normal(0,1);
for (k in 1:K)
z_w[k] ~ normal(0,1);
//likelihood
for (i in 1:N)
response[i] ~ bernoulli_logit(X[i] * beta + Z_u[i] * u[subj[i]] + Z_w[i] * w[item[i]]);
}
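In this model the linear predictor is on the log-odds scale: bernoulli_logit applies the inverse logit to it to obtain the probability of a 1 response. The following R sketch shows that transformation with made-up values of the linear predictor; it illustrates the link function only and is not part of the model code.

```r
set.seed(1)

# Made-up linear predictor values on the log-odds scale.
eta <- c(-2, 0, 1.5)

# Inverse logit: the probability of a 1 response for each value of eta.
p <- plogis(eta)
p

# The same transformation written out explicitly.
all.equal(p, 1 / (1 + exp(-eta)))

# Simulate binary responses with these probabilities.
response <- rbinom(length(eta), size = 1, prob = p)
response
```

The same function, plogis(), can be applied to posterior samples of the linear predictor to express results as probabilities rather than log-odds.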

For reproducible example code

See here.
