How do we know if we have a good line?

So as we said before, our model is something that we learned from data.
And there are lots of complicated model types and lots of interesting ways we can learn from data.
But we're gonna start with something very simple and familiar.
This will open the gateway to more sophisticated methods.
Let's train a first little model from data.
So here we've got a small data set.
On the X axis, we've got our input feature, which is showing housing square footage.
On our Y axis, we've got the target value that we're trying to predict, which is housing price.
So we're gonna try and create a model that takes in housing square footage as an input feature and predicts housing price as its output.
Here we've got lots of little labeled examples in our data set.
And I'm gonna go ahead and channel our inner ninth grader to fit a line.
We can maybe take a look at our data set and fit a line that looks about right here. Maybe something like this.
And this line is now a model that predicts housing price given an input.
We can recall from algebra one that we can define this thing as Y = WX + B.
Now in high school algebra we would have said MX, here we say W because it's machine learning.
And this is referring to our weight vectors.
Now you'll notice that we've got a little subscript here because we might be in more than one dimension.
This B is a bias, and the W gives us our slope.
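
A minimal sketch of the line as a function, in Python. The function name and the values of the weight and bias are made up for illustration; they are not taken from the data set in the lecture.

```python
def predict(square_feet: float, w: float, b: float) -> float:
    """Return the predicted price for one house using y = w*x + b."""
    return w * square_feet + b


# Hypothetical parameters: the price rises by 150 per square foot
# on top of a base value of 50,000.
w = 150.0
b = 50_000.0

price = predict(1_000.0, w, b)
print(price)  # 200000.0
```

The slope w controls how much the prediction changes per extra square foot, and the bias b is the prediction at zero square feet.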
How do we know if we have a good line?
Well, we might wanna think of some notion of loss here.
Loss is showing basically how well our line is doing at predicting any given example.
So we can define this loss by looking at the difference between the prediction for a given X value and the true value for that example.
So this guy has some moderate size loss.
This guy has near-zero loss.
Here we've got exactly zero loss.
Here we probably have some positive loss.
Loss is always on a scale from zero upward; it's never negative.
How might we define loss? Well, that's something that we'll need to think about in a slightly more formal way.
So let's think about one convenient way to define loss for regression problems.
Not the only loss function, but one useful one to start out with.
We call this L2 loss, which is also known as squared error.
And it's a loss that's defined for an individual example by taking the square of the difference between our model's prediction and the true value.
Now obviously as we get further and further away from the true value, the loss that we suffer increases with the square of that distance.
Now, when we're training a model we don't care about minimizing loss on just one example; we care about minimizing loss across our entire data set.
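
Before averaging over a data set, the per-example L2 loss described above can be sketched as follows. The function name and the numbers are hypothetical and only illustrate how the loss grows with the square of the error.

```python
def squared_loss(prediction: float, true_value: float) -> float:
    """L2 loss for one example: the squared difference."""
    return (prediction - true_value) ** 2


# Hypothetical predictions against a true value of 10.
print(squared_loss(10.0, 10.0))  # 0.0  (perfect prediction)
print(squared_loss(12.0, 10.0))  # 4.0  (off by 2)
print(squared_loss(14.0, 10.0))  # 16.0 (off by 4)
```

Doubling the error from 2 to 4 quadruples the loss from 4 to 16, and because of the square the loss can never be negative.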

Linear Regression

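How do we get a linear relationship (a model) from labeled examples?

Suppose we want to model the relationship between temperature (y) and the number of cricket chirps per minute (x). We could proceed like this:

  1. Plot the existing data as a scatter plot
  2. Draw a simple straight line that approximates the relationship between the two
  3. Use the equation of that line to write a linear expression, for example y = wx + b

Here y is the value we are trying to predict, w is the slope of the line, b is the y-intercept, and x is the feature.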

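To predict a case that has not been observed yet, just plug its feature into the model. A more sophisticated model relies on more features, and each feature has its own weight, for example with three features: y = w1x1 + w2x2 + w3x3 + b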
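
A small sketch of a model with several features, assuming it is simply a weighted sum plus a bias. The function name `predict_multi` and all numbers are hypothetical.

```python
from typing import Sequence


def predict_multi(features: Sequence[float],
                  weights: Sequence[float],
                  bias: float) -> float:
    """Return bias + w1*x1 + w2*x2 + ... for one example."""
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature")
    return bias + sum(w * x for w, x in zip(weights, features))


# Hypothetical example with three features.
features = [2.0, 3.0, 4.0]
weights = [0.5, 1.0, -1.0]
print(predict_multi(features, weights, bias=1.0))  # 1.0
```

With a single feature this reduces to the line y = wx + b from the steps above.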

Training and Loss

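Training a model simply means finding a good line (which requires good values for the weight w and the bias b).

In supervised learning, a machine learning algorithm examines many examples and looks for a model with minimal loss. This process is called empirical risk minimization.

Loss is a number that indicates how bad the model's prediction was on a single example. If the model's prediction is perfect, the loss is zero; otherwise, the loss is greater.

The goal of training a model is to find a set of weights w and a bias b that have low loss across the whole data set.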

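A popular way to compute loss is squared loss (also known as L2 loss): for a single example, it is the square of the difference between the true value and the model's prediction.

Mean squared error (MSE) is the average squared loss per example: the sum of the squared losses of all examples divided by the number of examples.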
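
As a rough sketch, MSE can be computed for a candidate line and used to compare two lines on the same data. The function name and the data points are invented for illustration; they are not real cricket measurements.

```python
def mean_squared_error(xs, ys, w, b):
    """Average squared loss of the line y = w*x + b over all examples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and the same length")
    total = sum((w * x + b - y) ** 2 for x, y in zip(xs, ys))
    return total / len(xs)


# Hypothetical data that lies exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

mse_good = mean_squared_error(xs, ys, w=2.0, b=1.0)
mse_bad = mean_squared_error(xs, ys, w=1.0, b=1.0)
print(mse_good)  # 0.0
print(mse_bad)   # 7.5
```

The line with the lower MSE (here w = 2, b = 1) is the better fit for this data.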

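Now we know the goal of training a model: find the line with low loss. What counts as low loss? The line with the smallest mean squared error. The next question is how we get closer to that line.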
