multiple r squared

adjusted r squared

http://web.maths.unsw.edu.au/~adelle/Garvan/Assays/GoodnessOfFit.html

Goodness-of-Fit Statistics

Sum of Squares Due to Error

This statistic measures the total deviation of the response values from the fit to the response values. It is also called the summed square of residuals and is usually labelled as SSE.

$$\mathrm{SSE} = \sum_{i=1}^{n} w_i \,(y_i - f_i)^2$$

Here yi is the observed data value and fi is the predicted value from the fit. wi is the weighting applied to each data point, usually wi = 1.

A value closer to 0 indicates that the model has a smaller random error component, and that the fit will be more useful for prediction.
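The SSE definition above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original page: the function name `sse` and the sample numbers are made up for the example, and the weights default to 1 as the text suggests.

```python
def sse(y, f, w=None):
    """Weighted sum of squared residuals: sum of w_i * (y_i - f_i)**2."""
    if w is None:
        w = [1.0] * len(y)  # usual case: every data point has weight 1
    return sum(wi * (yi - fi) ** 2 for wi, yi, fi in zip(w, y, f))


# Illustrative data (invented for this example).
y_obs = [1.0, 2.0, 3.0, 4.0]   # observed values y_i
y_fit = [1.1, 1.9, 3.2, 3.8]   # predicted values f_i

print(sse(y_obs, y_fit))  # approximately 0.10
```

Passing a list of weights as the third argument gives the weighted form; leaving it out reproduces the common unweighted case.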

R-Square

This statistic measures how successful the fit is in explaining the variation of the data. Put another way, R-square is the square of the correlation between the response values and the predicted response values. It is also called the square of the multiple correlation coefficient and the coefficient of multiple determination.

R-square is defined as

$$\text{R-square} = 1 - \frac{\sum_{i=1}^{n} w_i \,(y_i - f_i)^2}{\sum_{i=1}^{n} w_i \,(y_i - y_{\mathrm{av}})^2} = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}}$$

Here fi is the predicted value from the fit, yav is the mean of the observed data, and yi is the observed data value. wi is the weighting applied to each data point, usually wi = 1. SSE is the sum of squares due to error and SST is the total sum of squares.

For a least-squares fit that includes a constant term, R-square takes a value between 0 and 1, with a value closer to 1 indicating that a greater proportion of variance is accounted for by the model. For example, an R-square value of 0.8234 means that the fit explains 82.34% of the total variation in the data about the average.
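As a rough sketch of the definition R-square = 1 - SSE/SST, the following Python computes it for a small invented data set. The function name `r_square` and the numbers are assumptions made for illustration, and unit weights (wi = 1) are assumed throughout.

```python
def r_square(y, f):
    """R-square = 1 - SSE/SST, assuming unit weights (w_i = 1)."""
    y_av = sum(y) / len(y)                              # mean of observed data
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, f))   # sum of squares due to error
    sst = sum((yi - y_av) ** 2 for yi in y)             # total sum of squares
    return 1.0 - sse / sst


# Illustrative data (invented for this example).
y_obs = [1.0, 2.0, 3.0, 4.0]
y_fit = [1.1, 1.9, 3.2, 3.8]

r2 = r_square(y_obs, y_fit)
print(r2)  # approximately 0.98
```

Here SSE is about 0.10 and SST is 5.0, so the fit accounts for roughly 98% of the variation about the mean.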

If you increase the number of fitted coefficients in your model, R-square will typically increase even though the fit may not improve in a practical sense. To avoid this situation, use the degrees of freedom adjusted R-square statistic described below.

Note that it is possible to get a negative R-square for equations that do not contain a constant term. Because R-square is defined as the proportion of variance explained by the fit, if the fit is actually worse than just fitting a horizontal line then R-square is negative. In this case, R-square cannot be interpreted as the square of a correlation. Such situations indicate that a constant term should be added to the model.
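The negative case can be reproduced with a small example. The sketch below uses invented data lying exactly on y = 9 + x and compares a least-squares line forced through the origin with one that has a constant term; the helper `r_square` and all variable names are hypothetical, and unit weights are assumed.

```python
def r_square(y, f):
    """R-square = 1 - SSE/SST, assuming unit weights (w_i = 1)."""
    y_av = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, f))
    sst = sum((yi - y_av) ** 2 for yi in y)
    return 1.0 - sse / sst


# Invented data lying exactly on the line y = 9 + x.
x = [1.0, 2.0, 3.0]
y = [10.0, 11.0, 12.0]

# Least-squares line with no constant term: f = b * x.
b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
r2_no_constant = r_square(y, [b * xi for xi in x])

# Least-squares line with a constant term: f = a + c * x.
x_av = sum(x) / len(x)
y_av = sum(y) / len(y)
c = (sum((xi - x_av) * (yi - y_av) for xi, yi in zip(x, y))
     / sum((xi - x_av) ** 2 for xi in x))
a = y_av - c * x_av
r2_with_constant = r_square(y, [a + c * xi for xi in x])

print(r2_no_constant)    # negative: worse than a horizontal line at the mean
print(r2_with_constant)  # 1.0 for this exactly linear data
```

The through-origin fit has a larger SSE than the horizontal line at the mean, so SSE/SST exceeds 1 and R-square drops below zero; adding the constant term restores a sensible value.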

Degrees of Freedom Adjusted R-Square

This statistic uses the R-square statistic defined above, and adjusts it based on the residual degrees of freedom. The residual degrees of freedom is defined as the number of response values n minus the number of fitted coefficients m estimated from the response values.

v = n-m

v indicates the number of independent pieces of information involving the n data points that are required to calculate the sum of squares. Note that if parameters are bounded and one or more of the estimates are at their bounds, then those estimates are regarded as fixed. The degrees of freedom is increased by the number of such parameters.

The adjusted R-square statistic is generally the best indicator of the fit quality when you compare two models that are nested – that is, a series of models each of which adds additional coefficients to the previous model.

$$\text{adjusted R-square} = 1 - \frac{\mathrm{SSE}\,(n-1)}{\mathrm{SST}\,v}$$

The adjusted R-square statistic can take on any value less than or equal to 1, with a value closer to 1 indicating a better fit. Negative values can occur when the model contains terms that do not help to predict the response.
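A minimal sketch of the adjusted statistic, using v = n - m as defined above. The function name `adjusted_r_square` and the input values are assumptions for illustration; the check for v <= 0 is an added safeguard, not something the original text specifies.

```python
def adjusted_r_square(sse, sst, n, m):
    """Adjusted R-square = 1 - SSE*(n - 1) / (SST*v), with v = n - m."""
    v = n - m  # residual degrees of freedom
    if v <= 0:
        raise ValueError("need more response values than fitted coefficients")
    return 1.0 - sse * (n - 1) / (sst * v)


# Illustrative values (invented): 4 data points, 2 fitted coefficients.
adj = adjusted_r_square(sse=0.10, sst=5.0, n=4, m=2)
print(adj)  # approximately 0.97
```

With the same SSE and SST the plain R-square would be about 0.98, so the adjustment lowers the value slightly to account for the two fitted coefficients.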

Root Mean Squared Error

This statistic is also known as the fit standard error and the standard error of the regression. It is an estimate of the standard deviation of the random component in the data, and is defined as

$$\mathrm{RMSE} = s = \sqrt{\mathrm{MSE}}$$

where MSE is the mean square error, also called the residual mean square:

$$\mathrm{MSE} = \frac{\mathrm{SSE}}{v}$$

Just as with SSE, an MSE value closer to 0 indicates a fit that is more useful for prediction.
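The two definitions combine into a short calculation. The sketch below is illustrative only: the function name `rmse` and the inputs are invented, and v = n - m is taken from the degrees-of-freedom section.

```python
import math


def rmse(sse, n, m):
    """Root mean squared error: sqrt(MSE), where MSE = SSE / v and v = n - m."""
    v = n - m          # residual degrees of freedom
    mse = sse / v      # mean square error (residual mean square)
    return math.sqrt(mse)


# Illustrative values (invented): SSE of 0.10 from 4 points and 2 coefficients.
s = rmse(sse=0.10, n=4, m=2)
print(s)  # approximately 0.2236
```

Note that the divisor is v rather than n, so this differs from the "mean of squared errors" convention that divides by the number of data points.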
