I want to give a quick tutorial on fitting linear mixed models (hierarchical models) with a full variance-covariance matrix for the random effects (what Barr et al. 2013 call a maximal model) using Stan.

For a longer version of this tutorial, see: Sorensen, Hohenstein, Vasishth, 2016.

Prerequisites: You need to have R installed; RStudio is recommended but optional. You also need to have rstan installed. See here. I am also assuming you have fit lmer models like this one before:

lmer(log(rt) ~ 1+RCType+dist+int+(1+RCType+dist+int|subj) + (1+RCType+dist+int|item), dat)

If you don't know what the above code means, first read chapter 4 of my lecture notes.

The code and data format needed to fit LMMs in Stan

The data

I assume you have a 2x2 repeated measures design with a continuous dependent measure such as reading time (rt), and that you want to use contrast coding for the two main effects and their interaction. Let's say the main effects are RCType and dist, and the interaction is coded as int. All of these contrasts are coded as ±1. If you don't know what contrast coding is, see these notes and read section 4.3 (although it's best to read the whole chapter). I am using an excerpt of an example data set from Husain et al. 2014.

"subj" "item" "rt" "RCType" "dist" "int"
1 14 438 -1 -1 1
1 16 531 1 -1 -1
1 15 422 1 1 1
1 18 1000 -1 -1 1
...

Assume that these data are stored in R as a data-frame with name rDat.
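
As a minimal sketch of the expected layout, the four rows shown above can be typed in by hand. Storing subj and item as factors is an assumption of this sketch, not something the excerpt itself shows. With ±1 coding, the interaction column is simply the product of the two main-effect columns, which the code reproduces.

```r
# Toy version of rDat built from the four rows printed above.
rDat <- data.frame(
  subj   = c(1, 1, 1, 1),
  item   = c(14, 16, 15, 18),
  rt     = c(438, 531, 422, 1000),
  RCType = c(-1, 1, 1, -1),
  dist   = c(-1, -1, 1, -1)
)

# With +/-1 coding, the interaction contrast is the product of the main effects.
rDat$int <- rDat$RCType * rDat$dist

# Assumption: subject and item identifiers are stored as factors.
rDat$subj <- factor(rDat$subj)
rDat$item <- factor(rDat$item)

str(rDat)
```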

The Stan code

Copy the following Stan code into a text file and save it as the file matrixModel.stan. For continuous data like reading times or EEG, you never need to touch this file again. You will only ever specify the design matrix X and the structure of the data. The rest is all taken care of.

data {
  int<lower=0> N;                  //number of trials
  int<lower=1> P;                  //number of fixefs
  int<lower=0> J;                  //number of subjects
  int<lower=1> n_u;                //number of subj ranefs
  int<lower=0> K;                  //number of items
  int<lower=1> n_w;                //number of item ranefs
  int<lower=1,upper=J> subj[N];    //subject indicator
  int<lower=1,upper=K> item[N];    //item indicator
  row_vector[P] X[N];              //fixef design matrix
  row_vector[n_u] Z_u[N];          //subj ranef design matrix
  row_vector[n_w] Z_w[N];          //item ranef design matrix
  vector[N] rt;                    //reading time
} parameters {
  vector[P] beta;                  //fixef coefs
  cholesky_factor_corr[n_u] L_u;   //cholesky factor of subj ranef corr matrix
  cholesky_factor_corr[n_w] L_w;   //cholesky factor of item ranef corr matrix
  vector<lower=0>[n_u] sigma_u;    //subj ranef std
  vector<lower=0>[n_w] sigma_w;    //item ranef std
  real<lower=0> sigma_e;           //residual std
  vector[n_u] z_u[J];              //spherical subj ranef
  vector[n_w] z_w[K];              //spherical item ranef
} transformed parameters {
  vector[n_u] u[J];                //subj ranefs
  vector[n_w] w[K];                //item ranefs
  {
    matrix[n_u,n_u] Sigma_u;       //cholesky factor of subj ranef cov matrix
    matrix[n_w,n_w] Sigma_w;       //cholesky factor of item ranef cov matrix
    Sigma_u = diag_pre_multiply(sigma_u,L_u);
    Sigma_w = diag_pre_multiply(sigma_w,L_w);
    for(j in 1:J)
      u[j] = Sigma_u * z_u[j];
    for(k in 1:K)
      w[k] = Sigma_w * z_w[k];
  }
} model {
  //priors
  L_u ~ lkj_corr_cholesky(2.0);
  L_w ~ lkj_corr_cholesky(2.0);
  for (j in 1:J)
    z_u[j] ~ normal(0,1);
  for (k in 1:K)
    z_w[k] ~ normal(0,1);
  //likelihood
  for (i in 1:N)
    rt[i] ~ lognormal(X[i] * beta + Z_u[i] * u[subj[i]] + Z_w[i] * w[item[i]], sigma_e);
}
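
The transformed parameters block builds each subject's random effects u[j] by multiplying standard-normal draws z_u[j] by diag(sigma_u) * L_u (this is usually called a non-centered parameterization). The following R sketch checks, with made-up values for the standard deviations and the correlation, that this product has covariance diag(sigma_u) * Omega * diag(sigma_u), where Omega = L_u * L_u' is the correlation matrix. None of the numbers come from the model or data above.

```r
# Made-up values, for illustration only.
sigma_u <- c(0.5, 0.2)                          # ranef standard deviations
Omega   <- matrix(c(1, 0.3, 0.3, 1), nrow = 2)  # ranef correlation matrix

# chol() returns the upper-triangular factor; Stan's L_u is lower-triangular.
L_u <- t(chol(Omega))

# R equivalent of Stan's diag_pre_multiply(sigma_u, L_u)
A <- diag(sigma_u) %*% L_u

# Covariance implied for u = A %*% z when z ~ Normal(0, I)
Sigma_implied  <- A %*% t(A)
Sigma_expected <- diag(sigma_u) %*% Omega %*% diag(sigma_u)
print(all.equal(Sigma_implied, Sigma_expected))

# Simulation check: the empirical covariance of u should be close to Sigma_expected.
set.seed(1)
z <- matrix(rnorm(2 * 10000), nrow = 2)
u <- A %*% z
print(round(cov(t(u)), 3))
```

Sampling z_u and z_w on the unit scale and rescaling them afterwards leaves the model unchanged; it only changes the geometry the sampler has to explore.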

Define the design matrix

Since we want to test the two main effects and their interaction, coded as the columns RCType, dist, and int, our design matrix will look like this:

# Make design matrix
X <- unname(model.matrix(~ 1 + RCType + dist + int, rDat))
attr(X, "assign") <- NULL
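
To see what this produces, here is a small sketch that applies the same two lines to the contrast columns of the four example rows shown earlier. The toy data frame is retyped here so the snippet runs on its own.

```r
# Contrast columns from the four example rows.
rDat <- data.frame(
  RCType = c(-1, 1, 1, -1),
  dist   = c(-1, -1, 1, -1),
  int    = c(1, -1, 1, 1)
)

# Same construction as above: intercept column plus the three contrasts.
X <- unname(model.matrix(~ 1 + RCType + dist + int, rDat))
attr(X, "assign") <- NULL

# One row per trial, one column per fixed effect; the first column is all 1s.
print(X)
print(dim(X))
```

unname() drops the row and column names and the second line drops the "assign" attribute, so that X is a plain numeric matrix.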

Prepare data for Stan

Stan expects the data as a list, not as a data frame (unlike lmer). So we set it up as follows:

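# Make Stan data
stanDat <- list(N = nrow(X),
                P = ncol(X),
                n_u = ncol(X),
                n_w = ncol(X),
                X = X,
                Z_u = X,
                Z_w = X,
                J = length(unique(rDat$subj)),
                K = length(unique(rDat$item)),
                rt = rDat$rt,
                # recode ids to consecutive integers 1..J and 1..K
                # (works whether or not subj and item are stored as factors)
                subj = as.integer(factor(rDat$subj)),
                item = as.integer(factor(rDat$item)))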
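
The Stan data block requires the subject and item indicators to be integers between 1 and J (or K), so it can be worth checking the indices before fitting. Below is a minimal sketch of such a check; check_index is a hypothetical helper written for this example (it is not part of rstan), and the item identifiers are toy values.

```r
# Hypothetical helper (not part of rstan): checks that an index vector
# respects the bounds declared in the Stan data block.
check_index <- function(idx, n_groups, n_obs) {
  is.integer(idx) && length(idx) == n_obs &&
    !anyNA(idx) && all(idx >= 1L) && all(idx <= n_groups)
}

# Toy item identifiers that are not consecutive integers.
item_raw <- c(14, 16, 15, 18, 14)
K <- length(unique(item_raw))

# Recoding through factor() maps the identifiers onto 1..K.
item_idx <- as.integer(factor(item_raw))
print(item_idx)
print(check_index(item_idx, K, length(item_raw)))

# Passing the raw identifiers would violate the upper bound K.
print(check_index(as.integer(item_raw), K, length(item_raw)))
```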

Load library rstan and fit Stan model

library(rstan)
rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())

# Fit the model
matrixFit <- stan(file = "matrixModel.stan", data = stanDat,
                  iter = 2000, chains = 4)

Examine posteriors

print(matrixFit)

This print output is overly verbose. I wrote a simple function to get the essential information quickly.

stan_results <- function(m, params) {
  # posterior draws, one list element per requested (scalar) parameter
  m_extr <- rstan::extract(m, pars = params)
  # posterior mean and 95% credible interval for each parameter
  means <- lapply(m_extr, mean)
  quantiles <- lapply(m_extr,
                      function(x) quantile(x, probs = c(0.025, 0.975)))
  means <- data.frame(means)
  quants <- data.frame(quantiles)
  summry <- t(rbind(means, quants))
  colnames(summry) <- c("mean", "lower", "upper")
  summry
}

For example, if I want to see only the posteriors of the four beta parameters, I can write:

stan_results(matrixFit, params=c("beta[1]","beta[2]","beta[3]","beta[4]"))
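
The same mean-and-interval summary can be tried out without a fitted model. In the sketch below, the draws are simulated stand-ins for the list of posterior samples that extract() returns, and summarise_posterior is a hypothetical name used only in this example; the numbers are not results from the model above.

```r
set.seed(123)

# Stand-in for the list returned by extract(): one vector of draws per parameter.
# These draws are simulated, not output from a fitted model.
draws <- list(
  "beta[1]" = rnorm(4000, mean = 6.0, sd = 0.05),
  "beta[2]" = rnorm(4000, mean = 0.03, sd = 0.02)
)

# Hypothetical helper: posterior mean and 95% credible interval per parameter.
summarise_posterior <- function(d) {
  t(sapply(d, function(x) {
    c(mean  = mean(x),
      lower = unname(quantile(x, 0.025)),
      upper = unname(quantile(x, 0.975)))
  }))
}

res <- summarise_posterior(draws)
print(round(res, 3))
```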

For more details, such as interpreting the results and computing things like Bayes factors, see Nicenboim and Vasishth 2016.

FAQ: What if I don't want to fit a lognormal?

In the Stan code above, I assume a lognormal likelihood for the reading times:

 rt[i] ~ lognormal(X[i] * beta + Z_u[i] * u[subj[i]] + Z_w[i] * w[item[i]], sigma_e);

If this upsets you deeply and you want to use a normal distribution (and in fact, for EEG data this makes sense), go right ahead and change the lognormal to normal:

 rt[i] ~ normal(X[i] * beta + Z_u[i] * u[subj[i]] + Z_w[i] * w[item[i]], sigma_e);
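
One way to see what the choice means: a lognormal likelihood on rt is the same as a normal likelihood on log(rt), which is what the lmer call with log(rt) at the top of this tutorial fits, so the coefficients are on the log scale. With a normal likelihood on rt, they are on the millisecond scale instead. The R sketch below checks the equivalence numerically; the values of mu and sigma are made up for illustration.

```r
rt    <- c(438, 531, 422, 1000)  # reading times from the example rows
mu    <- 6.2                     # made-up location on the log scale
sigma <- 0.4                     # made-up residual sd on the log scale

# Lognormal density of rt ...
d_lognormal <- dlnorm(rt, meanlog = mu, sdlog = sigma)

# ... equals the normal density of log(rt), times the Jacobian 1/rt.
d_normal_log <- dnorm(log(rt), mean = mu, sd = sigma) / rt

print(all.equal(d_lognormal, d_normal_log))

# Under the lognormal model, exp(mu) is the median reading time in ms.
print(exp(mu))
```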

FAQ: What if my dependent measure is a binary (0,1) response?

Use this Stan code instead of the one shown above. Here, I assume that you have a column called response in the data, which has 0,1 values. These are the trial level binary responses.

data {
  int<lower=0> N;                  //number of trials
  int<lower=1> P;                  //number of fixefs
  int<lower=0> J;                  //number of subjects
  int<lower=1> n_u;                //number of subj ranefs
  int<lower=0> K;                  //number of items
  int<lower=1> n_w;                //number of item ranefs
  int<lower=1,upper=J> subj[N];    //subject indicator
  int<lower=1,upper=K> item[N];    //item indicator
  row_vector[P] X[N];              //fixef design matrix
  row_vector[n_u] Z_u[N];          //subj ranef design matrix
  row_vector[n_w] Z_w[N];          //item ranef design matrix
  int<lower=0,upper=1> response[N]; //binary response
} parameters {
  vector[P] beta;                  //fixef coefs
  cholesky_factor_corr[n_u] L_u;   //cholesky factor of subj ranef corr matrix
  cholesky_factor_corr[n_w] L_w;   //cholesky factor of item ranef corr matrix
  vector<lower=0>[n_u] sigma_u;    //subj ranef std
  vector<lower=0>[n_w] sigma_w;    //item ranef std
  vector[n_u] z_u[J];              //spherical subj ranef
  vector[n_w] z_w[K];              //spherical item ranef
} transformed parameters {
  vector[n_u] u[J];                //subj ranefs
  vector[n_w] w[K];                //item ranefs
  {
    matrix[n_u,n_u] Sigma_u;       //cholesky factor of subj ranef cov matrix
    matrix[n_w,n_w] Sigma_w;       //cholesky factor of item ranef cov matrix
    Sigma_u = diag_pre_multiply(sigma_u,L_u);
    Sigma_w = diag_pre_multiply(sigma_w,L_w);
    for(j in 1:J)
      u[j] = Sigma_u * z_u[j];
    for(k in 1:K)
      w[k] = Sigma_w * z_w[k];
  }
} model {
  //priors
  beta ~ cauchy(0,2.5);
  sigma_u ~ cauchy(0,2.5);
  sigma_w ~ cauchy(0,2.5);
  L_u ~ lkj_corr_cholesky(2.0);
  L_w ~ lkj_corr_cholesky(2.0);
  for (j in 1:J)
    z_u[j] ~ normal(0,1);
  for (k in 1:K)
    z_w[k] ~ normal(0,1);
  //likelihood
  for (i in 1:N)
    response[i] ~ bernoulli_logit(X[i] * beta + Z_u[i] * u[subj[i]] + Z_w[i] * w[item[i]]);
}
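
On the R side, the data list then needs a response entry in place of rt. Here is a minimal sketch with toy data; the name stanDatBin and all values in the data frame are made up for illustration. The last lines show how to read the linear predictor: bernoulli_logit(eta) means that the probability of a 1 response is the inverse logit of eta, which is plogis(eta) in R.

```r
# Toy data; the response values are made up.
rDat <- data.frame(
  subj     = factor(c(1, 1, 2, 2)),
  item     = factor(c(14, 16, 14, 16)),
  RCType   = c(-1, 1, -1, 1),
  dist     = c(-1, -1, 1, 1),
  response = c(0L, 1L, 1L, 1L)
)
rDat$int <- rDat$RCType * rDat$dist

X <- unname(model.matrix(~ 1 + RCType + dist + int, rDat))
attr(X, "assign") <- NULL

# Same list as for the continuous model, with response replacing rt.
stanDatBin <- list(N = nrow(X), P = ncol(X),
                   n_u = ncol(X), n_w = ncol(X),
                   X = X, Z_u = X, Z_w = X,
                   J = nlevels(rDat$subj),
                   K = nlevels(rDat$item),
                   response = as.integer(rDat$response),
                   subj = as.integer(rDat$subj),
                   item = as.integer(rDat$item))

# Inverse logit: linear predictor (log-odds) to probability of a 1 response.
eta <- c(-2, 0, 2)
print(round(plogis(eta), 3))
```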

For reproducible example code

See here.
