Google's Machine Learning Crash Course #02# Descending into ML
INDEX
How do we know if we have a good line
So as we said before, our model is something that we learned from data.
And there are lots of complicated model types and lots of interesting ways we can learn from data.
But we're gonna start with something very simple and familiar.
This will open the gateway to more sophisticated methods.
Let's train a first little model from data.
So here we've got a small data set.
On the X axis, we've got our input feature, which is showing housing square footage.
On our Y axis, we've got the target value that we're trying to predict, which is housing price.
So we're gonna try and create a model that takes in housing square footage as an input feature and predicts housing price as an output feature.
Here we've got lots of little labeled examples in our data set.
And I'm gonna go ahead and channel our inner ninth grader to fit a line.
We can maybe take a look at our data set and fit a line that looks about right here. Maybe something like this.
And this line is now a model that predicts housing price given an input.
We can recall from algebra one that we can define this thing as Y = WX + B.
Now in high school algebra we would have said MX, here we say W because it's machine learning.
And this is referring to our weight vectors.
Now you'll notice that we've got a little subscript here because we might be in more than one dimension.
This B is a bias, and the W gives us our slope.
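As a minimal sketch of the line described above, the model can be written as a one-line function. The function name `predict` and the values of `w` and `b` are hypothetical and chosen only for illustration; they are not fitted to any real housing data.

```python
def predict(x, w, b):
    """Linear model: y = w * x + b."""
    return w * x + b


# Hypothetical weight and bias, not learned from real data.
w = 0.5   # slope: change in predicted price per extra unit of square footage
b = 10.0  # bias: where the line crosses the y axis

print(predict(100.0, w, b))  # 60.0
```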
How do we know if we have a good line?
Well, we might wanna think of some notion of loss here.
Loss is showing basically how well our line is doing at predicting any given example.
So we can define this loss by looking at the difference between the prediction for a given X value
and the true value for that example.
So this guy has some moderate size loss.
This guy has near-zero loss.
Here we've got exactly zero loss.
Here we probably have some positive loss.
Loss is always on a zero through positive scale.
How might we define loss? Well, that's something that we'll need to think about in a slightly more formal way.
So let's think about one convenient way to define loss for regression problems.
Not the only loss function, but one useful one to start out with.
We call this L2 loss, which is also known as squared error.
And it's a loss that's defined for an individual example by taking the square of the difference between our model's prediction and the true value.
Now obviously as we get further and further away from the true value, the loss that we suffer increases with the square of that distance.
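The squared error just described can be sketched for a single example as follows. The function name `squared_loss` and the numbers are made up for illustration; the point is only that doubling the error quadruples the loss.

```python
def squared_loss(prediction, label):
    """L2 loss for one example: the squared difference."""
    return (prediction - label) ** 2


label = 10.0  # hypothetical true value
for prediction in (10.0, 11.0, 12.0, 13.0):
    error = prediction - label
    print(error, squared_loss(prediction, label))
# prints:
# 0.0 0.0
# 1.0 1.0
# 2.0 4.0
# 3.0 9.0
```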
Now, when we're training a model we don't care about minimizing loss on just one example, we care about minimizing loss across our entire data set.
Linear Regression
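How do we get a linear relationship (a model) from labeled examples?
Suppose we want to model the relationship between temperature (y) and the number of cricket chirps per minute (x). We can proceed as follows:
- Plot the existing data as a scatter plot
- Draw a simple straight line that approximates the relationship between the two
- Use the equation of the line to write a linear expression, for example y = wx + b
Here y is the value we are trying to predict, w is the slope of the line, b is the y-intercept, and x is the feature.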
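To predict a case that has not been observed yet, just substitute the feature value into the model. A more sophisticated model relies on more features, each with its own weight. With three features, for example:
y = w1x1 + w2x2 + w3x3 + b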
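A model with several features can be sketched as a weighted sum plus a bias. The function name `predict_multi` and all numbers below are hypothetical; they only show that each feature is multiplied by its own weight.

```python
def predict_multi(features, weights, bias):
    """Linear model with one weight per feature: sum(w_i * x_i) + b."""
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature")
    return sum(w * x for w, x in zip(weights, features)) + bias


# Hypothetical values for three features.
weights = [0.5, 2.0, -1.0]
bias = 4.0
features = [2.0, 3.0, 1.0]

# 0.5*2.0 + 2.0*3.0 + (-1.0)*1.0 + 4.0
print(predict_multi(features, weights, bias))  # 10.0
```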

Training and Loss
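Training a model simply means finding a good line, which requires good values for the weight w and the bias b.
In supervised learning, a machine learning algorithm examines many examples and finds a model that minimizes loss. This process is called empirical risk minimization.
Loss is a number that indicates how bad the model's prediction was on a single example. If the prediction is perfect, the loss is zero; otherwise, the loss is greater.
The goal of training is to find a set of weights w and a bias b that have low loss across the whole data set.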
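A popular way to compute loss is squared loss (also known as L2 loss):
(y - prediction(x))^2
where y is the label and prediction(x) is the model's output for the feature x.
Mean squared error (MSE) is the average squared loss per example:
MSE = (1/N) * sum over all examples (x, y) of (y - prediction(x))^2
where N is the number of examples.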
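Mean squared error can be sketched as an average of per-example squared losses, which also gives a way to compare two candidate lines. The function name `mean_squared_error`, the data, and both candidate lines are made up for illustration.

```python
def mean_squared_error(examples, w, b):
    """Average squared loss of the line y = w*x + b over (x, y) examples."""
    total = 0.0
    for x, y in examples:
        prediction = w * x + b
        total += (y - prediction) ** 2
    return total / len(examples)


# Made-up labeled examples that lie exactly on y = 2x + 1.
examples = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]

print(mean_squared_error(examples, 2.0, 1.0))  # 0.0 (every prediction is exact)
print(mean_squared_error(examples, 1.0, 1.0))  # 7.5 (errors of 1, 2, 3 and 4)
```

Of the two candidates, the first has the lower MSE, so it is the better line for this data set.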

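Now we know the goal of training a model: find the line with low loss. What counts as low loss? The line with the smallest mean squared error. The next question is how we move toward that line.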