A function to help graphical model checks of lm and ANOVA(转)
As always a more colourful version of this post is available on rpubs.
Even if LM are very simple models at the basis of many more complex ones, LM still have some assumptions that if not met would render any interpretation from the models plainly wrong. In my field of research most people were taught about checking ANOVA assumptions using tests like Levene & co. This is however not the best way to check if my model meet its assumptions as p-values depend on the sample size, with small sample size we will almost never reject the null hypothesis while with big sample even small deviation will lead to significant p-values (discussion). As ANOVA and linear models are two different ways to look at the same model (explanation) we can check ANOVA assumptions using graphical check from a linear model. In R this is easily done using plot(model), but people often ask me what amount of deviation makes me reject a model. One easy way to see if the model checking graphs are off the charts is to simulate data from the model, fit the model to these newly simulated data and compare the graphical checks from the simulated data with the real data. If you cannot differentiate between the simulated and the real data then your model is fine, if you can then try again!
Below is a little function that implement this idea:
lm.test<-function(m require(plyr)
#the model frame
dat<-model.frame(m)
#the model matrix
f<-formula(m)
modmat<-model.matrix(f,dat)
#the standard deviation of the residuals
sd.resid<-sd(resid(m #sample size
n<-dim(dat)[1]
#get the right-hand side of the formula
#rhs<-all.vars(update(f, 0~.))
#simulate 8 response vectors from model
ys<-lapply(1:8,function(x) rnorm(n,modmat%*%coef(m),sd.resid))
#refit the models
ms<-llply(ys,function(y) lm(y~modmat[,-1]))
#put the residuals and fitted values in a list
df<-llply(ms,function(x) data.frame(Fitted=fitted(x),Resid=resid(x)))
#select a random number from 2 to 8
rnd<-sample(2:8,1)
#put the original data into the list
df<-c(df[1:(rnd-1)],list(data.frame(Fitted=fitted(m),Resid=resid(m))),df[rnd:8]) #plot
par(mfrow=c(3,3))
l_ply(df,function(x){
plot(Resid~Fitted,x,xlab="Fitted",ylab="Residuals")
abline(h=0,lwd=2,lty=2)
}) l_ply(df,function(x){
qqnorm(x$Resid)
qqline(x$Resid)
}) out<-list(Position=rnd)
return(out)
}
This function print the two basic plots: one looking at the spread of the residuals around the fitted values, the other one look at the normality of the residuals. The function return the position of the real model in the 3×3 window, counting from left to right and from top to bottom (ie position 1 is upper left graph).
Let’s try the function:
#a simulated data frame of independent variables
dat<-data.frame(Temp=runif(100,0,20),Treatment=gl(n = 5,k = 20))
contrasts(dat$Treatment)<-"contr.sum"
#the model matrix
modmat<-model.matrix(~Temp*Treatment,data=dat)
#the coefficient
coeff<-rnorm(10,0,4)
#simulate response data
dat$Biomass<-rnorm(100,modmat%*%coeff,1)
#the model
m<-lm(Biomass~Temp*Treatment,dat)
#model check
chk<-lm.test(m)


Can you find which one is the real one? I could not, here is the answer:
chk
$Position
[1] 4
Happy and safe modelling!
A function to help graphical model checks of lm and ANOVA(转)的更多相关文章
- PGM:概率图模型Graphical Model
http://blog.csdn.net/pipisorry/article/details/51461878 概率图模型Graphical Models简介 完全通过代数计算来对更加复杂的模型进行建 ...
- 概率图模型(PGM,Probabilistic Graphical Model)
PGM是现代信号处理(尤其是机器学习)的重要内容. PGM通过图的方式,将多个随机变量之前的关系通过简洁的方式表现出来.因此PGM包括图论和概率论的相关内容. PGM理论研究并解决三个问题: 1)表示 ...
- [zz] 混合高斯模型 Gaussian Mixture Model
聚类(1)——混合高斯模型 Gaussian Mixture Model http://blog.csdn.net/jwh_bupt/article/details/7663885 聚类系列: 聚类( ...
- 构建自己的PHP框架--实现Model类(1)
在之前的博客中,我们定义了ORM的接口,以及决定了使用PDO去实现.最后我们提到会有一个Model类实现ModelInterface接口. 现在我们来实现这个接口,如下: <?php names ...
- Implementation Model Editor of AVEVA in OpenSceneGraph
Implementation Model Editor of AVEVA in OpenSceneGraph eryar@163.com 摘要Abstract:本文主要对工厂和海工设计软件AVEVA的 ...
- 【再探backbone 01】模型-Model
前言 点保存时候不注意发出来了,有需要的朋友将就看吧,还在更新...... 几个月前学习了一下backbone,这段时间也用了下,感觉之前对backbone的学习很是基础,前几天有个园友问我如何将路由 ...
- PHP MVC 中的MODEL层
Model层,就是MVC模式中的数据处理层,用来进行数据和商业逻辑的装封 三.实现你的Mode层 Model层,就是MVC模式中的数据处理层,用来进行数据和商业逻辑的装封,进行他的设计的时候设计到三个 ...
- hdwiki中model模块的应用
control中调用model原则是这样的,如果你的这个model在本control中大部分方法中都要用到,那么,就写在构造函数里面.例如,名字为doc的control的构造函数如下: functio ...
- [Backbone.js]如何处理Model里面嵌入的Collection?
写了近半个月的backbone.js代码,从一开始的todo到现在做仿微信的网页聊天,其中最大的困惑就在于如何处理比较复杂的Model,其内嵌了一个或者多个Collections. 假设我们有一个Pe ...
随机推荐
- Python爬虫入门 Urllib库的基本使用
1.分分钟扒一个网页下来 怎样扒网页呢?其实就是根据URL来获取它的网页信息,虽然我们在浏览器中看到的是一幅幅优美的画面,但是其实是由浏览器解释才呈现出来的,实质它是一段HTML代码,加 JS.CSS ...
- linux常用20命令 --转载
玩过Linux的人都会知道,Linux中的命令的确是非常多,但是玩过Linux的人也从来不会因为Linux的命令如此之多而烦恼,因为我们只需要掌握我们最常用的命令就可以了.当然你也可以在使用时去找一下 ...
- 使用 Http 的 Post 方式与网络交互通信
package zw1; import java.io.BufferedReader;import java.io.BufferedWriter;import java.io.InputStream; ...
- ThreadLocal学习笔记
首先,ThreadLocal是Java语言提供的用于支持线程局部变量的标准实现类.很多时候,ThreadLocal与Synchronized在功能上有一定的共性,都可以用来解决多线程环境下线程安全问题 ...
- 频繁模式挖掘中Apriori、FP-Growth和Eclat算法的实现和对比
最近上数据挖掘的课程,其中学习到了频繁模式挖掘这一章,这章介绍了三种算法,Apriori.FP-Growth和Eclat算法:由于对于不同的数据来说,这三种算法的表现不同,所以我们本次就对这三种算法在 ...
- Angular2.js——数据显示
显示数据,即属性绑定机制把数据显示到用户界面上. 在Angular中最典型的数据显示方式,就是把HTML模板中的控件绑定到Angular组件的属性. 接下来介绍几种数据显示的语法和代码片段. 使用插值 ...
- bzoj1834 [ZJOI2010]网络扩容
Description 给定一张有向图,每条边都有一个容量C和一个扩容费用W.这里扩容费用是指将容量扩大1所需的费用.求: 1. 在不扩容的情况下,1到N的最大流: 2. 将1到N的最大流增加K所需的 ...
- Less与Sass
less 1.变量 声明变量:@变量名:变量值 使用变量:@变量名 >>>Less中变量的类型 ①数字类:1 100px ②字符串:无引号字符串[red] 有引号字符串[&qu ...
- 使用Apache Spark 对 mysql 调优 查询速度提升10倍以上
在这篇文章中我们将讨论如何利用 Apache Spark 来提升 MySQL 的查询性能. 介绍 在我的前一篇文章Apache Spark with MySQL 中介绍了如何利用 Apache Spa ...
- java入门基础
什么是java? java是一门编程语言 编程语言有很多种 你比如 C语言 等等 为什么学习java呢! 因为你要和计算机交互 当然了你用汉语跟她说她听不懂 所以你要学习编程语言 那么额咱们的ja ...