Regression analysis
Source: http://wenku.baidu.com/link?url=9KrZhWmkIDHrqNHiXCGfkJVQWGFKOzaeiB7SslSdW_JnXCkVHsHsXJyvGbDva4V5A-uuOl84mg5zkTECichHX_AsN0mZalfI9BzDFOeNe-G###
❤ Simple linear regression
1. Y = β0 + β1*X + e
where:
Y - dependent variable (response)
X - independent variable (predictor/explanatory)
β0 - intercept
β1 - slope of the regression line
e - random error
2. Y' = b0 + b1*X
where: Y' - predicted value of Y
e = Y - Y'
3. Least squares regression minimizes the sum of the squared errors and is used to estimate b0 and b1.
4. Measuring the fit of the estimated model.
- The variability of Y
SST (Sum of Squares Total): total variability about the mean, SST = sum((Y - mean(Y))^2);
SSE (Sum of Squared Errors): variability about the regression line, SSE = sum(e^2) = sum((Y - Y')^2), SSE is the unexplained variability;
SSR (Sum of Squares due to Regression): variability that is explained, SSR = sum((Y' - mean(Y))^2), SSR is the explained variability.
Note that SST = SSE + SSR.
- Coefficient of determination
r^2: proportion of the variability in Y that is explained by the regression equation.
0 <= r^2 = 1 - SSE/SST = SSR/SST <= 1
- Correlation coefficient
r: strength and direction of the linear relationship between X and Y.
-1 <= r <= 1
5. Assumptions in the regression model
Errors are independent and normally distributed, with a mean of zero and a constant variance.
The assumptions can be tested by using residual analysis.
6. MSE (Mean Squared Error)
An estimate of the error variance of the regression equation.
s^2 = MSE = SSE / (n - k - 1)
where:
n - number of observations in the sample
k - number of independent variables
The standard deviation of the regression, s = sqrt(MSE), is also frequently used.
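Points 3 to 6 above can be sketched in a few lines of plain Python. This is only a minimal illustration, not a reference implementation: the function names (fit_simple_linear, fit_summary) and the five sample points are made up for the example, and the slope uses the usual closed-form least squares solution b1 = Sxy / Sxx, b0 = mean(Y) - b1 * mean(X).

```python
from math import sqrt


def fit_simple_linear(x, y):
    """Least squares estimates (b0, b1) for Y' = b0 + b1*X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def fit_summary(x, y):
    """Sums of squares, r^2, MSE and s for a simple linear regression (k = 1)."""
    n, k = len(y), 1
    b0, b1 = fit_simple_linear(x, y)
    mean_y = sum(y) / n
    y_hat = [b0 + b1 * xi for xi in x]                      # predicted values Y'
    sst = sum((yi - mean_y) ** 2 for yi in y)               # total variability
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # unexplained
    ssr = sum((yh - mean_y) ** 2 for yh in y_hat)           # explained
    mse = sse / (n - k - 1)
    return {
        "b0": b0, "b1": b1,
        "SST": sst, "SSE": sse, "SSR": ssr,
        "r2": 1 - sse / sst,
        "MSE": mse, "s": sqrt(mse),
    }


# Hypothetical sample data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
summary = fit_summary(x, y)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

With these sample points SST and SSE + SSR agree up to floating-point rounding, which is the identity noted in point 4.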
❤ Test the model for significance: F-test
Used to statistically test the null hypothesis H0: there is no linear relationship between Y and X (i.e. β1 = 0).
If the p-value is low, we reject H0 and conclude that there is a linear relationship. The test statistic is:
F = MSR / MSE
where: MSR = SSR / k
A good regression model should have a significant F value and a high r^2 value.
Statistical tests can also be performed on the individual regression coefficients, with H0: β = 0 for each coefficient.
For a simple linear regression, the test on the regression coefficient gives the same information as the F-test.
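The equivalence between the two tests in simple linear regression can be checked numerically: the F statistic equals the square of the t statistic for b1. The sketch below assumes the standard formula for the standard error of the slope, se(b1) = sqrt(MSE / Sxx), which is not stated in these notes; the function name and the sample data are hypothetical. It computes only the test statistics; turning F into a p-value needs the F distribution with (k, n - k - 1) degrees of freedom, which is not shown here.

```python
from math import sqrt


def f_and_t_statistics(x, y):
    """Return (F, t) for a simple linear regression of y on x (k = 1)."""
    n, k = len(x), 1
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    y_hat = [b0 + b1 * xi for xi in x]
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    ssr = sum((yh - mean_y) ** 2 for yh in y_hat)
    msr = ssr / k
    mse = sse / (n - k - 1)
    f_stat = msr / mse
    # Assumed standard error of the slope: sqrt(MSE / Sxx).
    t_stat = b1 / sqrt(mse / sxx)
    return f_stat, t_stat


# Hypothetical sample data, chosen only for illustration.
f_stat, t_stat = f_and_t_statistics([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"F = {f_stat:.4f}, t = {t_stat:.4f}, t^2 = {t_stat ** 2:.4f}")
```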
❤ ANOVA tables
The general form of the ANOVA table is helpful for understanding the interrelatedness of error terms.
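The table itself is not reproduced in these notes. As a rough sketch, the usual textbook layout (assumed here, not taken from the source) has one row each for regression, residual and total, with columns for the sum of squares, degrees of freedom, mean square and F. The helper name anova_table and the input numbers are hypothetical.

```python
def anova_table(ssr, sse, n, k):
    """Rows of (source, SS, df, MS, F) for a regression ANOVA table."""
    df_reg, df_err = k, n - k - 1
    msr = ssr / df_reg
    mse = sse / df_err
    return [
        ("Regression", ssr, df_reg, msr, msr / mse),
        ("Residual", sse, df_err, mse, None),
        ("Total", ssr + sse, n - 1, None, None),
    ]


def format_cell(value):
    """Render a number with three decimals, or a blank for missing cells."""
    return "" if value is None else f"{value:.3f}"


# Hypothetical inputs: SSR = 3.6, SSE = 2.4 from n = 5 observations, k = 1 predictor.
rows = anova_table(ssr=3.6, sse=2.4, n=5, k=1)
print(f"{'Source':<12}{'SS':>10}{'df':>6}{'MS':>10}{'F':>10}")
for source, ss, df, ms, f in rows:
    print(f"{source:<12}{format_cell(ss):>10}{df:>6}{format_cell(ms):>10}{format_cell(f):>10}")
```

Reading across a row shows how the quantities relate: each mean square is its sum of squares divided by its degrees of freedom, and F is the ratio of the two mean squares.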
❤ Multiple regression
Similar to the simple regression model, but with more than one independent variable X.
Y' = b0 + b1*X1 + b2*X2 + ... + bk*Xk
where k is the number of independent variables.
Note that if the independent variables are correlated with each other, collinearity (or multicollinearity) occurs. This causes problems when interpreting the variables individually, although the overall model estimate may still be good.
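A minimal sketch of fitting such a model, assuming the coefficients are obtained by solving the normal equations (X^T X) b = X^T Y, which is one standard way to compute the least squares solution. All function names and the data are hypothetical; the data are constructed so that Y = 1 + 2*X1 + 3*X2 exactly. The correlation between the two predictors is also computed as a simple first check for collinearity.

```python
from math import sqrt


def solve(a, b):
    """Solve a * coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (m[i][n] - tail) / m[i][i]
    return coef


def fit_multiple(rows, y):
    """Return [b0, b1, ..., bk] for rows of predictor values [x1, ..., xk]."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 for the intercept
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)


def correlation(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)


# Hypothetical data generated from Y = 1 + 2*X1 + 3*X2 with no noise.
rows = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
coefficients = fit_multiple(rows, y)
print("b0, b1, b2:", [round(b, 6) for b in coefficients])
print("corr(X1, X2):", round(correlation([r[0] for r in rows], [r[1] for r in rows]), 3))
```

When the predictors are strongly correlated, X^T X becomes close to singular, so the individual coefficients become unstable even if the fitted values remain good. That is the interpretation problem described above.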