Multivariate Linear Regression
Multiple Features
Linear regression with multiple variables is also known as "multivariate linear regression".
We now introduce notation for equations where we can have any number of input variables:

- $x_j^{(i)}$ = value of feature $j$ in the $i$-th training example
- $x^{(i)}$ = the input (features) of the $i$-th training example
- $m$ = the number of training examples
- $n$ = the number of features
The multivariable form of the hypothesis function accommodating these multiple features is as follows:

$$h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_3 + \cdots + \theta_n x_n$$
In order to develop intuition about this function, we can think about θ0 as the basic price of a house, θ1 as the price per square meter, θ2 as the price per floor, etc. x1 will be the number of square meters in the house, x2 the number of floors, etc.
Using the definition of matrix multiplication, our multivariable hypothesis function can be concisely represented as:

$$h_\theta(x) = \begin{bmatrix} \theta_0 & \theta_1 & \cdots & \theta_n \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_n \end{bmatrix} = \theta^T x$$
This is a vectorization of our hypothesis function for one training example; see the lessons on vectorization to learn more.
Remark: Note that, for convenience, in this course we assume $x_0^{(i)} = 1$ for all $i \in \{1, \dots, m\}$. This allows us to do matrix operations with theta and x, making the two vectors $\theta$ and $x^{(i)}$ match each other element-wise (that is, have the same number of elements: $n + 1$).
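As a concrete illustration, here is a minimal NumPy sketch of the vectorized hypothesis. The two-feature housing setup and all numeric values are made up for the example; they are not from the course.

```python
import numpy as np

# Made-up data: 3 houses with 2 features (size in m^2, number of floors).
# A leading column of ones plays the role of x_0 = 1, so each row has
# n + 1 = 3 elements, matching theta.
X = np.array([[1.0, 104.0, 2.0],
              [1.0,  65.0, 1.0],
              [1.0, 210.0, 3.0]])
theta = np.array([50.0, 0.8, 10.0])  # [theta_0, theta_1, theta_2], illustrative values

# Vectorized hypothesis h_theta(x) = theta^T x, computed for all examples at once.
predictions = X @ theta
print(predictions)  # one predicted price per house
```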
Gradient Descent For Multiple Variables
The gradient descent equation itself is generally the same form; we just have to repeat it for our 'n' features:

repeat until convergence: {
$$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} \qquad \text{for } j := 0 \dots n$$
}
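Below is a minimal NumPy sketch of this simultaneous update. The function name, the default values of alpha and num_iters, and the data layout (a design matrix X whose first column is all ones) are illustrative choices, not part of the course material.

```python
import numpy as np

def gradient_descent(X, y, alpha=0.01, num_iters=1000):
    """Batch gradient descent for multivariate linear regression.

    X is an (m, n+1) design matrix whose first column is all ones,
    y is an (m,) vector of targets. Returns the learned theta.
    """
    m = len(y)
    theta = np.zeros(X.shape[1])
    for _ in range(num_iters):
        errors = X @ theta - y               # h_theta(x^(i)) - y^(i) for every example
        theta -= alpha * (X.T @ errors) / m  # simultaneous update of all theta_j
    return theta
```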
Gradient Descent in Practice I - Feature Scaling
Note: [6:20 - The average size of a house is 1000 but 100 is accidentally written instead]
We can speed up gradient descent by having each of our input values in roughly the same range. This is because θ will descend quickly on small ranges and slowly on large ranges, and so will oscillate inefficiently down to the optimum when the variables are very uneven.
The way to prevent this is to modify the ranges of our input variables so that they are all roughly the same. Ideally:

$$-1 \le x_{(i)} \le 1 \qquad \text{or} \qquad -0.5 \le x_{(i)} \le 0.5$$
These aren't exact requirements; we are only trying to speed things up. The goal is to get all input variables into roughly one of these ranges, give or take a few.
Two techniques to help with this are feature scaling and mean normalization. Feature scaling involves dividing the input values by the range (i.e. the maximum value minus the minimum value) of the input variable, resulting in a new range of just 1. Mean normalization involves subtracting the average value for an input variable from the values for that input variable, resulting in a new average value for the input variable of just zero. To implement both of these techniques, adjust your input values as shown in this formula:

$$x_i := \dfrac{x_i - \mu_i}{s_i}$$
where $\mu_i$ is the average of all the values for feature (i) and $s_i$ is the range of values (max − min), or $s_i$ is the standard deviation.
Note that dividing by the range and dividing by the standard deviation give different results. The quizzes in this course use the range; the programming exercises use the standard deviation.
For example, if $x_i$ represents housing prices with a range of 100 to 2000 and a mean value of 1000, then $x_i := \dfrac{price - 1000}{1900}$.
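A minimal sketch of both techniques in NumPy follows; `mean_normalize` is a hypothetical helper name, and the tiny data set is made up to mirror the housing-price example above. Apply it only to the raw feature columns, never to the column of ones standing in for $x_0$.

```python
import numpy as np

def mean_normalize(X):
    """Mean normalization + feature scaling: subtract each feature's mean,
    then divide by its range (max - min), giving values in roughly [-0.5, 0.5].

    Swap `s` for X.std(axis=0) to divide by the standard deviation instead,
    as the programming exercises do.
    """
    mu = X.mean(axis=0)
    s = X.max(axis=0) - X.min(axis=0)
    return (X - mu) / s, mu, s

# Made-up housing prices spanning 100..2000 with mean 1000, as in the text.
X = np.array([[100.0], [900.0], [2000.0]])
X_norm, mu, s = mean_normalize(X)
print(X_norm.ravel())  # roughly [-0.47, -0.05, 0.53]
```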
Gradient Descent in Practice II - Learning Rate
Note: [5:20 - the x-axis label in the right graph should be θ rather than No. of iterations]
Debugging gradient descent. Make a plot with the number of iterations on the x-axis. Now plot the cost function J(θ) over the number of iterations of gradient descent. If J(θ) ever increases, then you probably need to decrease α.
Automatic convergence test. Declare convergence if J(θ) decreases by less than E in one iteration, where E is some small value such as $10^{-3}$. However, in practice it is difficult to choose this threshold value.
It has been proven that if learning rate α is sufficiently small, then J(θ) will decrease on every iteration.
To summarize:
If α is too small: slow convergence.
If α is too large: may not decrease on every iteration and thus may not converge.
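As a debugging aid, a small sketch like the following can plot J(θ) against the iteration number for several values of α. The synthetic data, candidate learning rates, and iteration count are all made up for illustration; a curve that climbs or diverges suggests α is too large, while one that creeps down very slowly suggests it is too small.

```python
import numpy as np
import matplotlib.pyplot as plt

def cost(X, y, theta):
    """J(theta) = (1 / 2m) * sum of squared errors."""
    errors = X @ theta - y
    return (errors @ errors) / (2 * len(y))

# Synthetic data: 50 examples, 2 features plus a leading column of ones.
rng = np.random.default_rng(0)
X = np.c_[np.ones(50), rng.uniform(0.0, 2.0, size=(50, 2))]
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(0.0, 0.1, size=50)

for alpha in (0.01, 0.1, 0.3):                 # candidate learning rates
    theta = np.zeros(X.shape[1])
    history = []
    for _ in range(100):
        theta -= alpha * (X.T @ (X @ theta - y)) / len(y)
        history.append(cost(X, y, theta))
    plt.plot(history, label=f"alpha={alpha}")

plt.xlabel("number of iterations")
plt.ylabel("J(theta)")
plt.legend()
plt.show()
```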
Features and Polynomial Regression
We can improve our features and the form of our hypothesis function in a couple different ways.
We can combine multiple features into one. For example, we can combine x1 and x2 into a new feature x3 by taking x1⋅x2.
Polynomial Regression
Our hypothesis function need not be linear (a straight line) if that does not fit the data well.
We can change the behavior or curve of our hypothesis function by making it a quadratic, cubic or square root function (or any other form). For example, if our hypothesis is $h_\theta(x) = \theta_0 + \theta_1 x_1$, we can create additional features based on $x_1$ to get the quadratic function $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_1^2$ or the cubic function $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_1^2 + \theta_3 x_1^3$.
One important thing to keep in mind: if you choose your features this way, then feature scaling becomes very important. For example, if $x_1$ has range 1–1000, then the range of $x_1^2$ becomes 1–$10^6$ and that of $x_1^3$ becomes 1–$10^9$. A minimal sketch of this idea appears below.
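In the sketch, `polynomial_features` is a hypothetical helper, not a course-provided function; it expands one feature into polynomial terms and then mean-normalizes each derived column before gradient descent, since the columns otherwise differ in scale by many orders of magnitude.

```python
import numpy as np

def polynomial_features(x, degree=3):
    """Expand a single feature x into columns [x, x^2, ..., x^degree]."""
    return np.column_stack([x ** d for d in range(1, degree + 1)])

# A feature with range 1..1000: x^2 then spans 1..10^6 and x^3 spans 1..10^9,
# so mean-normalize every derived column before running gradient descent.
x1 = np.linspace(1.0, 1000.0, 50)
X_poly = polynomial_features(x1)
X_scaled = (X_poly - X_poly.mean(axis=0)) / (X_poly.max(axis=0) - X_poly.min(axis=0))
```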