Source: http://wenku.baidu.com/link?url=9KrZhWmkIDHrqNHiXCGfkJVQWGFKOzaeiB7SslSdW_JnXCkVHsHsXJyvGbDva4V5A-uuOl84mg5zkTECichHX_AsN0mZalfI9BzDFOeNe-G###

❤ Simple linear regression

1. Y = β0 + β1*X + e

where:

Y - dependent variable (response)

X - independent variable (predictor/explanatory)

β0 - intercept

β1 - slope of the regression line

e - random error

2. Y' = b0 + b1*X

where: Y' - predicted value of Y

e = Y - Y'

3. Least squares regression chooses b0 and b1 (the estimates of β0 and β1) to minimize the sum of the squared errors.
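As a rough illustration of item 3, the least squares estimates for a single predictor have a closed form: b1 = sum((X - mean(X)) * (Y - mean(Y))) / sum((X - mean(X))^2) and b0 = mean(Y) - b1 * mean(X). The function name `fit_simple_ols` and the five data points below are made up for this sketch and do not come from the original notes.

```python
def fit_simple_ols(x, y):
    """Return (b0, b1) minimizing the sum of squared errors for Y' = b0 + b1*X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of X
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical toy data, chosen only to keep the arithmetic easy to follow
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_simple_ols(x, y)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")  # b0 = 2.200, b1 = 0.600
```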

4. Measuring the fit of the estimated model.

- The variability of Y

SST (Total Sum of Squares): total variability about the mean, SST = sum((Y - mean(Y))^2);

SSE (Sum of Squared Errors): variability about the regression line, SSE = sum(e^2) = sum((Y - Y')^2), SSE is the unexplained variability;

SSR (Sum of Squares due to Regression): variability that is explained, SSR = sum((Y' - mean(Y))^2), SSR is the explained variability.

Note that SST = SSE + SSR.

- Coefficient of determination

r^2: proportion of the variability in Y that is explained by the regression equation.

0 <= r^2 = 1 - SSE/SST = SSR/SST <= 1

- Correlation coefficient

r: strength and direction of the linear relationship between X and Y.

-1 <= r <= 1
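The quantities in item 4 can be checked numerically. The sketch below uses a made-up five-point data set (not from the original notes), fits the line by least squares, and then computes SST, SSE, SSR, r^2 and r so that the identity SST = SSE + SSR and the relation between r and r^2 can be seen directly.

```python
import math

# Hypothetical toy data for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least squares fit of Y' = b0 + b1*X
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x
y_hat = [b0 + b1 * xi for xi in x]

# Variability decomposition
sst = sum((yi - mean_y) ** 2 for yi in y)              # total
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained
ssr = sum((yh - mean_y) ** 2 for yh in y_hat)          # explained

r_squared = ssr / sst
# Pearson correlation; its sign matches the sign of the slope b1
r = sxy / math.sqrt(sxx * sst)

print(f"SST = {sst:.2f}, SSE = {sse:.2f}, SSR = {ssr:.2f}")
print(f"r^2 = {r_squared:.3f}, r = {r:.3f}")
```

For simple linear regression, squaring r gives the same value as SSR/SST.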

5. Assumptions in the regression model

Errors are independent and normally distributed, with a mean of zero and a constant variance.

The assumptions can be tested by using residual analysis.
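A minimal sketch of what residual analysis might start with, using made-up data. It computes the residuals, confirms that their mean is zero (which always holds for a least squares fit with an intercept), and computes the Durbin-Watson statistic as one possible check on independence. Durbin-Watson is not mentioned in the original notes and is only one of several diagnostics; in practice residual plots against the fitted values are usually inspected as well.

```python
# Hypothetical toy data for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least squares fit of Y' = b0 + b1*X
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

# Residuals e = Y - Y'
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
mean_residual = sum(residuals) / n

# Durbin-Watson: values near 2 suggest little first-order autocorrelation
durbin_watson = sum(
    (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, n)
) / sum(e ** 2 for e in residuals)

print("residuals:", [round(e, 2) for e in residuals])
print(f"mean residual = {mean_residual:.6f}")
print(f"Durbin-Watson = {durbin_watson:.3f}")
```

With only five observations these numbers say very little; the point is the mechanics, not the conclusion.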

6. MSE (Mean Squared Error)

An estimate of the error variance of the regression equation.

s^2 = MSE = SSE / (n - k - 1)

where:

n - number of observations in the sample

k - number of independent variables

Standard deviation of the regression: s = sqrt(MSE) is also frequently used.
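The formula in item 6 can be sketched as follows for a simple regression (k = 1). The data are made up for illustration and are not from the original notes.

```python
import math

# Hypothetical toy data for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)  # number of observations
k = 1       # number of independent variables

mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

# SSE = sum((Y - Y')^2)
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

mse = sse / (n - k - 1)  # s^2, the estimated error variance
s = math.sqrt(mse)       # standard deviation of the regression

print(f"MSE = {mse:.3f}, s = {s:.3f}")
```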

❤ Test the model for significance: F-test

Used to statistically test the null hypothesis H0: there is no linear relationship between Y and X (i.e. β1 = 0).

The test statistic is given below. If its p-value is low, we reject H0 and conclude that there is a linear relationship:

F = MSR / MSE

where: MSR = SSR / k

A good regression model should have a significant F value and a high r^2 value.

Statistical tests can also be performed on the individual regression coefficients. For each coefficient, H0: β = 0.

For a simple linear regression, the test on the regression coefficient gives the same information as the F-test.
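The equivalence of the two tests in simple linear regression can be seen numerically: the t statistic for the slope, squared, equals the F statistic. The sketch below uses made-up data and the standard formula se(b1) = sqrt(MSE / sum((X - mean(X))^2)) for the standard error of the slope, which the notes do not state explicitly.

```python
import math

# Hypothetical toy data for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
k = 1  # one independent variable

mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x
y_hat = [b0 + b1 * xi for xi in x]

sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
ssr = sum((yh - mean_y) ** 2 for yh in y_hat)

msr = ssr / k
mse = sse / (n - k - 1)
f_stat = msr / mse

# t statistic for H0: beta1 = 0
se_b1 = math.sqrt(mse / sxx)
t_stat = b1 / se_b1

print(f"F = {f_stat:.3f}, t = {t_stat:.3f}, t^2 = {t_stat ** 2:.3f}")
```

The p-value itself comes from the F distribution with (k, n - k - 1) degrees of freedom; if SciPy is available, `scipy.stats.f.sf(f_stat, k, n - k - 1)` is one way to obtain it.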

❤ ANOVA tables

The general form of the ANOVA table is helpful for understanding how the sums of squares, degrees of freedom, and mean squares relate to one another.
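The original notes do not reproduce the table, so the following is a sketch of the usual layout (Source, SS, df, MS, F) filled in for a simple regression on made-up data. The helper names `anova_rows` and `fmt` are hypothetical.

```python
def anova_rows(x, y):
    """Build ANOVA rows (source, SS, df, MS, F) for Y' = b0 + b1*X."""
    n, k = len(x), 1
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    y_hat = [b0 + b1 * xi for xi in x]

    sst = sum((yi - mean_y) ** 2 for yi in y)
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    ssr = sum((yh - mean_y) ** 2 for yh in y_hat)
    msr = ssr / k
    mse = sse / (n - k - 1)
    return [
        ("Regression", ssr, k, msr, msr / mse),
        ("Error", sse, n - k - 1, mse, None),
        ("Total", sst, n - 1, None, None),
    ]


def fmt(value):
    """Format a number to three decimals; leave empty cells blank."""
    return "" if value is None else f"{value:.3f}"


# Hypothetical toy data for illustration only
rows = anova_rows([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])

print(f"{'Source':<12}{'SS':>8}{'df':>4}{'MS':>8}{'F':>8}")
for source, ss, df, ms, f in rows:
    print(f"{source:<12}{fmt(ss):>8}{df:>4}{fmt(ms):>8}{fmt(f):>8}")
```

The SS and df columns each add up: the Regression and Error rows sum to the Total row.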

❤ Multiple regression

Similar to the simple regression model, but a multiple regression model has more than one X.

Y' = b0 + b1*X1 + b2*X2 + ... + bk*Xk
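One way to estimate the coefficients of a model with several predictors is to solve the normal equations (X^T X) b = X^T Y, where X has a leading column of ones for the intercept. The sketch below does this in plain Python for two predictors. The function names and the data are hypothetical, and the data are generated exactly from Y = 1 + 2*X1 + 3*X2 with no noise so that the recovered coefficients are easy to check.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (m[i][n] - known) / m[i][i]
    return solution


def fit_multiple_ols(predictor_rows, y):
    """Return [b0, b1, ..., bk] from the normal equations."""
    design = [[1.0] + list(row) for row in predictor_rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical noiseless data: Y = 1 + 2*X1 + 3*X2
predictor_rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in predictor_rows]

coefficients = fit_multiple_ols(predictor_rows, y)
print([round(c, 6) for c in coefficients])
```

Solving the normal equations directly is fine for a small illustration; numerical libraries typically use more stable factorizations.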

Note that if the independent variables are correlated with each other, collinearity (or multicollinearity) occurs. This causes problems when interpreting the variables individually, although the overall model estimate may still be good.
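A simple way to spot this problem with two predictors is to look at the correlation between them. The sketch below also computes the variance inflation factor (VIF), a diagnostic the original notes do not mention: for exactly two predictors it reduces to 1 / (1 - r12^2), where r12 is the correlation between X1 and X2. The data are made up so that X2 is almost a multiple of X1, and the function name `pearson_r` is hypothetical.

```python
import math


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    saa = sum((ai - mean_a) ** 2 for ai in a)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)


# Hypothetical predictors: x2 is nearly 2 * x1, so they are highly correlated
x1 = [1, 2, 3, 4, 5]
x2 = [2, 4, 6, 8, 11]

r12 = pearson_r(x1, x2)
vif = 1.0 / (1.0 - r12 ** 2)  # valid in this form only for two predictors

print(f"r12 = {r12:.4f}, VIF = {vif:.1f}")
```

A large VIF indicates that the individual coefficient estimates are unstable, even when the overall fit is good. A commonly quoted rule of thumb treats values above roughly 10 as a warning sign.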
