Regression analysis
Source: http://wenku.baidu.com/link?url=9KrZhWmkIDHrqNHiXCGfkJVQWGFKOzaeiB7SslSdW_JnXCkVHsHsXJyvGbDva4V5A-uuOl84mg5zkTECichHX_AsN0mZalfI9BzDFOeNe-G###
❤ Simple linear regression
1. Y = β0 + β1*X + e
where:
Y - dependent variable (response)
X - independent variable (predictor/explanatory)
β0 - intercept
β1 - slope of the regression line
e - random error
2. Y' = b0 + b1*X
where: Y' - predicted value of Y; b0, b1 - sample estimates of β0, β1
e = Y - Y' (the residual, which estimates the random error)
3. Least squares regression minimizes the sum of the squared errors, sum(e^2), and yields the estimates b0 and b1 of β0 and β1.
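For a single predictor, the least squares estimates have a closed form: b1 = Sxy / Sxx and b0 = mean(Y) - b1 * mean(X). A minimal sketch in plain Python follows; the function name `fit_simple_linear` and the data values are made up for illustration.

```python
def fit_simple_linear(x, y):
    """Return (b0, b1) that minimize sum((y - (b0 + b1 * x))^2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sxy: co-variation of X and Y about their means; Sxx: variation of X
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Illustrative data only (not from the source notes)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1 = fit_simple_linear(x, y)
print(f"Y' = {b0:.3f} + {b1:.3f} * X")
```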
4. Measuring the fit of the estimated model.
- The variability of Y
SST (Sum of Squares Total): total variability about the mean, SST = sum((Y - mean(Y))^2);
SSE (Sum of Squares Error): variability about the regression line, SSE = sum(e^2) = sum((Y - Y')^2), SSE is the unexplained variability;
SSR (Sum of Squares due to Regression): variability that is explained, SSR = sum((Y' - mean(Y))^2), SSR is the explained variability.
Note that SST = SSE + SSR.
- Coefficient of determination
r^2: proportion of the variability in Y that is explained by the regression equation.
0 <= r^2 = 1 - SSE/SST = SSR/SST <= 1
- Correlation coefficient
r: strength and direction of the linear relationship between X and Y. In simple linear regression, r = ±sqrt(r^2), with the same sign as b1.
-1 <= r <= 1
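The fit measures in item 4 can be checked numerically. The sketch below uses made-up data and hypothetical helper names; it computes SST, SSE and SSR for a simple linear fit, then derives r^2 and r from them.

```python
import math


def fit_simple_linear(x, y):
    """Least squares estimates (b0, b1) for one predictor."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def sums_of_squares(x, y):
    """Return (SST, SSE, SSR, b1) for the least squares line."""
    b0, b1 = fit_simple_linear(x, y)
    mean_y = sum(y) / len(y)
    y_hat = [b0 + b1 * xi for xi in x]                    # predicted values Y'
    sst = sum((yi - mean_y) ** 2 for yi in y)             # total variability
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained
    ssr = sum((yh - mean_y) ** 2 for yh in y_hat)         # explained
    return sst, sse, ssr, b1


# Illustrative data only (not from the source notes)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

sst, sse, ssr, b1 = sums_of_squares(x, y)
r_squared = 1 - sse / sst
# In simple linear regression r takes the sign of the slope b1
r = math.copysign(math.sqrt(r_squared), b1)

print(f"SST={sst:.3f} SSE={sse:.3f} SSR={ssr:.3f}")
print(f"r^2={r_squared:.3f} r={r:.3f}")
```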
5. Assumptions in the regression model
The errors are independent and normally distributed, with a mean of zero and a constant variance.
The assumptions can be tested by using residual analysis.
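A residual analysis starts from the residuals e = Y - Y'. The sketch below is only one possible starting point and uses made-up data. It computes the residuals and the Durbin-Watson statistic, a common check for first-order autocorrelation that the notes above do not name. Values near 2 are usually read as little evidence against independence. Normality and constant variance are typically checked with residual plots, which are not shown here.

```python
def residuals(y, y_hat):
    """Residuals e = Y - Y'."""
    return [yi - yh for yi, yh in zip(y, y_hat)]


def durbin_watson(e):
    """Sum of squared successive differences divided by sum of squares."""
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    den = sum(et ** 2 for et in e)
    return num / den


# Illustrative data: observed Y and predictions from Y' = 2.2 + 0.6 * X
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
y_hat = [2.2 + 0.6 * xi for xi in x]

e = residuals(y, y_hat)
dw = durbin_watson(e)

# For a least squares fit with an intercept the residuals sum to zero
print("sum of residuals:", round(sum(e), 10))
print("Durbin-Watson:", round(dw, 3))
```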
6. MSE (Mean Squared Error)
An estimate of the error variance of the regression equation.
s^2 = MSE = SSE / (n - k - 1)
where:
n - number of observations in the sample
k - number of independent variables
The standard deviation of the regression (the standard error of the estimate), s = sqrt(MSE), is also frequently used.
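As a small worked example of item 6, the sketch below computes MSE and s from observed and predicted values. The function name and the numbers are assumptions for illustration; the predictions come from the line Y' = 2.2 + 0.6*X with X = 1..5.

```python
import math


def mean_squared_error(y, y_hat, k):
    """MSE = SSE / (n - k - 1), with k independent variables."""
    n = len(y)
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    return sse / (n - k - 1)


# Illustrative values only (not from the source notes)
y = [2.0, 4.0, 5.0, 4.0, 5.0]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]

mse = mean_squared_error(y, y_hat, k=1)
s = math.sqrt(mse)  # standard deviation of the regression
print(f"MSE={mse:.3f} s={s:.3f}")
```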
❤ Test the model for significance: F-test
Used to statistically test the null hypothesis H0: there is no linear relationship between Y and X (i.e. β1 = 0).
The test statistic is:
F = MSR / MSE
where: MSR = SSR / k
Under H0, F follows an F distribution with k and n - k - 1 degrees of freedom.
If the p value is low, we reject H0 and conclude that there is a linear relationship.
A good regression model should have a significant F value and a high r^2 value.
Statistical tests can also be performed on the individual regression coefficients, with H0: β = 0 for each coefficient.
For a simple linear regression, the test on the regression coefficient gives the same information as the F-test.
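The equivalence of the two tests in simple linear regression can be seen numerically: the F statistic equals the square of the t statistic for the slope, t = b1 / (s / sqrt(Sxx)). The sketch below uses made-up data and a hypothetical function name, and computes only the statistics. A p value needs the F distribution, which the Python standard library does not provide; `scipy.stats.f.sf(F, k, n - k - 1)` is one option if SciPy is available.

```python
import math


def f_and_t_statistics(x, y):
    """Return (F, t) for a simple linear regression (k = 1)."""
    n, k = len(x), 1
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    y_hat = [b0 + b1 * xi for xi in x]
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    ssr = sum((yh - mean_y) ** 2 for yh in y_hat)
    msr = ssr / k
    mse = sse / (n - k - 1)
    f_stat = msr / mse
    # Standard error of the slope is s / sqrt(Sxx), with s = sqrt(MSE)
    t_stat = b1 / (math.sqrt(mse) / math.sqrt(sxx))
    return f_stat, t_stat


# Illustrative data only (not from the source notes)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

f_stat, t_stat = f_and_t_statistics(x, y)
print(f"F={f_stat:.3f} t={t_stat:.3f} t^2={t_stat ** 2:.3f}")
```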
❤ ANOVA tables
The general form of the ANOVA table is helpful for understanding the interrelatedness of error terms.
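The table itself is not reproduced in these notes. As a sketch of the usual layout (rows for regression, residual and total; columns for sum of squares, degrees of freedom, mean square and F), the code below builds the table from SSR, SSE, n and k. The function names and input numbers are illustrative assumptions.

```python
def anova_table(ssr, sse, n, k):
    """Rows of (source, SS, df, MS, F) for a regression with k predictors."""
    sst = ssr + sse              # SST = SSR + SSE
    msr = ssr / k                # mean square due to regression
    mse = sse / (n - k - 1)      # mean squared error
    return [
        ("Regression", ssr, k, msr, msr / mse),
        ("Residual", sse, n - k - 1, mse, None),
        ("Total", sst, n - 1, None, None),
    ]


def fmt(value):
    """Format a number for the table; blank for cells that are not defined."""
    return "" if value is None else f"{value:.4g}"


# Illustrative inputs only (not from the source notes)
rows = anova_table(ssr=3.6, sse=2.4, n=5, k=1)

print(f"{'Source':<12}{'SS':>8}{'df':>6}{'MS':>8}{'F':>8}")
for source, ss, df, ms, f_value in rows:
    print(f"{source:<12}{fmt(ss):>8}{df:>6}{fmt(ms):>8}{fmt(f_value):>8}")
```

The degrees of freedom add up in the same way as the sums of squares: k + (n - k - 1) = n - 1.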
❤ Multiple regression
Similar to the simple regression model, but with more than one independent variable X.
Y' = b0 + b1*X1 + b2*X2 + ... + bk*Xk
Note that if the independent variables are correlated with each other, collinearity (multicollinearity) occurs. This causes problems when interpreting the variables individually, although the overall model estimate may still be good.
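One way to estimate b0 ... bk is to solve the normal equations (X^T X) b = X^T Y, where X is the data matrix with a leading column of ones. The sketch below does this in plain Python with Gaussian elimination. The function names and data are made up for illustration, and in practice a numerical library would normally be used. Perfectly collinear predictors make X^T X singular, which is the extreme case of the multicollinearity problem noted above.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_multiple_linear(xs, y):
    """xs: rows of [X1, ..., Xk]. Returns [b0, b1, ..., bk]."""
    design = [[1.0] + list(row) for row in xs]  # column of ones for b0
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Illustrative data generated exactly from Y = 1 + 2*X1 + 3*X2
xs = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]]
y = [1.0 + 2.0 * x1 + 3.0 * x2 for x1, x2 in xs]

coeffs = fit_multiple_linear(xs, y)
print([round(b, 6) for b in coeffs])
```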