MachineLearningOnCoursera
Week Six
F Score
\[\begin{aligned}
F &= &\dfrac{2}{\dfrac{1}{P}+\dfrac{1}{R}}\\
&= &2 \dfrac{PR}{P+R}
\end{aligned}\]
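A minimal sketch of the formula above in Python. The function name `f_score` and the convention of returning 0 when both P and R are 0 are assumptions made for this illustration.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision P and recall R: 2PR / (P + R)."""
    if precision + recall == 0:
        return 0.0  # assumed convention: avoid dividing by zero
    return 2 * precision * recall / (precision + recall)


# A balanced classifier versus one with high precision but poor recall.
balanced = f_score(0.5, 0.5)
skewed = f_score(0.9, 0.1)
print(balanced, skewed)
```

Because the harmonic mean is dominated by the smaller of the two values, the skewed classifier scores well below the balanced one even though its arithmetic mean of P and R is the same.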
Week Seven
Support Vector Machine
Cost Function
\[\begin{aligned}
&\min_{\theta}\left[-\dfrac{1}{m}\sum_{y_{i}\in Y, x_{i} \in X}\left(y_{i} \log h(\theta^{T}x_{i})+(1-y_{i})\log (1-h(\theta^{T}x_{i}))\right)+\dfrac{\lambda}{2m} \sum_{\theta_{i} \in \theta}\theta_{i}^{2}\right]\\
&\Rightarrow \min_{\theta}\left[-\sum_{y_{i} \in Y,x_{i} \in X}\left(y_{i} \log h(\theta^{T}x_{i})+(1-y_{i})\log(1-h(\theta^{T}x_{i}))\right)+\dfrac{\lambda}{2}\sum_{\theta_{i} \in \theta}\theta_{i}^{2}\right]\\
&\Rightarrow \min_{\theta}\left[-C\sum_{y_{i} \in Y,x_{i} \in X}\left(y_{i} \log h(\theta^{T}x_{i})+(1-y_{i})\log(1-h(\theta^{T}x_{i}))\right)+\dfrac{1}{2}\sum_{\theta_{i} \in \theta}\theta_{i}^{2}\right]
\end{aligned}\]
C plays the role of \(\dfrac{1}{\lambda}\).
- Large C:
  - Lower bias, higher variance.
- Small C:
  - Higher bias, lower variance.
- Large \(\sigma^2\): Features \(f_{i}\) vary more smoothly.
  - Higher bias, lower variance.
- Small \(\sigma^2\): Features \(f_{i}\) vary more sharply.
  - Lower bias, higher variance.
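The effect of \(\sigma^2\) on the features can be seen in a small sketch. It assumes the features come from the Gaussian kernel \(f = \exp(-\|x - l\|^2 / (2\sigma^2))\), where \(l\) is a landmark; the formula, the name `gaussian_kernel`, and the sample points are assumptions for illustration and are not spelled out in these notes.

```python
import math


def gaussian_kernel(x, landmark, sigma2):
    """Similarity feature f = exp(-||x - l||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, landmark))
    return math.exp(-sq_dist / (2.0 * sigma2))


x = [1.0, 2.0]
landmark = [2.0, 4.0]  # squared distance to x is 1 + 4 = 5

f_small = gaussian_kernel(x, landmark, sigma2=0.5)  # falls off sharply
f_large = gaussian_kernel(x, landmark, sigma2=5.0)  # falls off smoothly
print(f_small, f_large)
```

At the same distance from the landmark, the small \(\sigma^2\) gives a feature close to 0 while the large \(\sigma^2\) keeps it well above 0, which is why a small \(\sigma^2\) produces a more flexible, higher-variance fit.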
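When C is very large, the optimization reduces to the large-margin problem:
\[\begin{aligned}
\min_{\theta}\ & \dfrac{1}{2} \sum_{\theta_{i} \in \theta}{\theta_{i}^2}\\
\text{s.t.}\ & \theta^{T}x_{i} \geq 1, && \text{if } y_{i} = 1\\
& \theta^{T}x_{i} \leq -1, && \text{if } y_{i} = 0
\end{aligned}\]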
PS
If n (the number of features) is large relative to m (the number of training examples), use logistic regression or an SVM without a kernel.
If n is small and m is intermediate, use an SVM with a Gaussian kernel.
If n is small and m is large, add more features, then use logistic regression or an SVM without a kernel.
Week Eight
K-means
Cost Function
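K-means tries to minimize the distortion
\[\min_{c,\mu}{\dfrac{1}{m} \sum_{i=1}^{m} ||x^{(i)} - \mu_{c^{(i)}}||^2}\]
Each iteration has two steps. The cluster-assignment step holds the centroids fixed and minimizes the cost over the assignments \(c^{(i)}\) by assigning every x in the training set to its closest centroid. The move-centroid step holds the assignments fixed and minimizes the cost over the centroids by moving each centroid to the mean of the points assigned to it.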
Initialize
Initialize the centroids randomly: select k samples at random from the training set and set the centroids to these samples.
K-means can fall into a local minimum, so repeat the random initialization until the cost (distortion) is suitable for your purposes.
K-means always converges, and the cost never increases during training. Adding more centroids should decrease the cost; if it does not, K-means has fallen into a local minimum, so reinitialize the centroids until the cost is lower.
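The two steps and the repeated random initialization can be sketched in plain Python. This is only an illustration under assumptions: the function names, the fixed seed, the number of restarts, and the choice to leave an empty cluster's centroid where it is are all hypothetical.

```python
import random


def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def distortion(points, centroids, assign):
    """Mean squared distance from each point to its assigned centroid."""
    return sum(sq_dist(p, centroids[c]) for p, c in zip(points, assign)) / len(points)


def kmeans_once(points, k, rng, max_iter=100):
    # Initialize the centroids to k randomly selected training samples.
    centroids = [list(p) for p in rng.sample(points, k)]
    assign = None
    for _ in range(max_iter):
        # Cluster-assignment step: centroids fixed, pick the closest one.
        new_assign = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_assign == assign:
            break  # assignments stopped changing, so K-means has converged
        assign = new_assign
        # Move-centroid step: assignments fixed, move centroids to the mean.
        for j in range(k):
            members = [p for p, c in zip(points, assign) if c == j]
            if members:  # assumption: an empty cluster keeps its centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign, distortion(points, centroids, assign)


def kmeans(points, k, restarts=10, seed=0):
    """Run several random initializations and keep the lowest-cost result."""
    rng = random.Random(seed)
    return min((kmeans_once(points, k, rng) for _ in range(restarts)),
               key=lambda result: result[2])


points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centroids, assign, cost = kmeans(points, k=2)
print(assign, cost)
```

Keeping the run with the lowest distortion is one simple way to reduce the chance of ending in a poor local minimum.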
PCA (Principal Component Analysis)
Reconstruct x from z, choosing k so that the following inequality holds (99% of the variance is retained):
\[1-\dfrac{\dfrac{1}{m} \sum_{i=1}^{m}||x^{(i)}-x^{(i)}_{approximation}||^2}{\dfrac{1}{m} \sum_{i=1}^{m} ||x^{(i)}||^2} \geq 0.99\]
PS:
With S taken from the SVD of the covariance matrix \(\Sigma\), the inequality is equivalent to the last line below:
\[\begin{aligned}
\lbrack U, S, V\rbrack &= svd(\Sigma)\\
U_{reduce} &= U(:, 1:k)\\
z &= U_{reduce}' * x\\
x_{approximation} &= U_{reduce} * z\\\\
S &= \left( \begin{array}{cccc}
s_{11}&0&\cdots&0\\
0&s_{22}&\cdots&0\\
\vdots&\vdots&\ddots&\vdots\\
0&0&\cdots&s_{nn}
\end{array} \right)\\\\
\dfrac{\sum_{i=1}^{k}s_{ii}}{\sum_{i=1}^{n} s_{ii}} &\geq 0.99
\end{aligned}\]
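Choosing k from the diagonal of S can be sketched as below. The sketch assumes S comes from the SVD of the covariance matrix, so its diagonal entries are variances sorted in decreasing order and are summed directly; the name `choose_k` and the sample values are hypothetical.

```python
def choose_k(s_diag, retained=0.99):
    """Smallest k whose leading diagonal entries keep `retained` of the variance."""
    total = sum(s_diag)
    running = 0.0
    for k, s in enumerate(s_diag, start=1):
        running += s
        if running / total >= retained:
            return k
    return len(s_diag)


# Hypothetical diagonal of S, sorted in decreasing order as svd returns it.
s_diag = [4.0, 3.0, 2.0, 0.05, 0.03]
k = choose_k(s_diag)
print(k)
```

This avoids recomputing the reconstruction error for every candidate k: a single SVD is enough to evaluate every value of k.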
Week Nine
Anomaly Detection
Gaussian Distribution
The multivariate Gaussian distribution takes the correlations between different variables into account:
\[p(x) = \dfrac{1}{(2\pi)^{\frac{n}{2}}|\Sigma|^{\frac{1}{2}}}e^{-\frac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)}\]
The univariate (per-feature) Gaussian model is a special case of the multivariate Gaussian distribution, where \(\Sigma\) is diagonal:
\[\Sigma = \left(\begin{array}{cccc}
\sigma_{11}&&&\\
&\sigma_{22}&&\\
&&\ddots&\\
&&&\sigma_{nn}
\end{array}\right)\]
When training an anomaly detector, the parameters can be fitted by maximum likelihood estimation:
\[\begin{aligned}
\mu &= \dfrac{1}{m} \sum_{i=1}^{m}x^{(i)}\\
\Sigma &= \dfrac{1}{m} \sum_{i=1}^{m} (x^{(i)}-\mu)(x^{(i)}-\mu)^{T}
\end{aligned}\]
The univariate model is much cheaper to compute than the multivariate one, but it may need some new features added to distinguish normal examples from anomalous ones.
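A minimal sketch of the cheaper univariate variant, which is the diagonal-\(\Sigma\) special case: fit a mean and variance per feature by maximum likelihood, then multiply the per-feature densities. The function names, the threshold `epsilon`, and the sample data are assumptions for illustration.

```python
import math


def fit_gaussian(X):
    """Maximum likelihood estimates of the per-feature mean and variance."""
    m, n = len(X), len(X[0])
    mu = [sum(row[j] for row in X) / m for j in range(n)]
    sigma2 = [sum((row[j] - mu[j]) ** 2 for row in X) / m for j in range(n)]
    return mu, sigma2


def density(x, mu, sigma2):
    """p(x) as a product of independent one-dimensional Gaussians."""
    p = 1.0
    for xj, mj, s2 in zip(x, mu, sigma2):
        p *= math.exp(-((xj - mj) ** 2) / (2.0 * s2)) / math.sqrt(2.0 * math.pi * s2)
    return p


def is_anomaly(x, mu, sigma2, epsilon):
    # epsilon is a hypothetical threshold; it has to be tuned on labeled data.
    return density(x, mu, sigma2) < epsilon


X = [[1.0, 10.0], [2.0, 12.0], [3.0, 11.0], [2.0, 11.0]]
mu, sigma2 = fit_gaussian(X)
print(mu, sigma2)
print(is_anomaly([8.0, 11.0], mu, sigma2, epsilon=1e-3))
```

Because the features are treated as independent, this model cannot flag a point whose individual features look normal but whose combination is unusual, which is the case the multivariate model handles.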
Recommender System
Cost Function
\[\begin{aligned}
J(X,\Theta) &= \dfrac{1}{2} \sum_{(i,j):r(i,j)=1}((\theta^{(j)})^{T}x^{(i)}-y^{(i,j)})^2 + \dfrac{\lambda}{2}\left[\sum_{i=1}^{n_{m}}\sum_{k=1}^{n}(x_k^{(i)})^2 + \sum_{j=1}^{n_{u}} \sum_{k=1}^n(\theta_{k}^{(j)})^2\right]\\
J(X,\Theta) &= \dfrac{1}{2}Sum\{((X\Theta'-Y).*R).^2\} + \dfrac{\lambda}{2}(Sum\{\Theta.^2\} + Sum\{X.^2\})
\end{aligned}\]
\[\begin{aligned}
\dfrac{\partial J}{\partial X} = ((X\Theta'-Y).*R) \Theta + \lambda X\\
\dfrac{\partial J}{\partial \Theta} = ((X\Theta'-Y).*R)'X + \lambda \Theta
\end{aligned}\]
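The cost and both gradients can be computed in one pass over the rated entries. The sketch below uses plain nested lists instead of matrix operations; the function name `cofi_cost_and_grads`, the argument layout (rows of `X` are items, rows of `Theta` are users), and the tiny data set are assumptions for illustration.

```python
def cofi_cost_and_grads(X, Theta, Y, R, lam):
    """Return J, dJ/dX and dJ/dTheta for regularized collaborative filtering."""
    n_m, n_u, n = len(X), len(Theta), len(X[0])
    # Start each gradient from its regularization term.
    X_grad = [[lam * X[i][k] for k in range(n)] for i in range(n_m)]
    Theta_grad = [[lam * Theta[j][k] for k in range(n)] for j in range(n_u)]
    J = 0.0
    for i in range(n_m):
        for j in range(n_u):
            if R[i][j] != 1:
                continue  # only rated (i, j) pairs contribute
            err = sum(Theta[j][k] * X[i][k] for k in range(n)) - Y[i][j]
            J += 0.5 * err ** 2
            for k in range(n):
                X_grad[i][k] += err * Theta[j][k]
                Theta_grad[j][k] += err * X[i][k]
    J += 0.5 * lam * (sum(v * v for row in X for v in row)
                      + sum(v * v for row in Theta for v in row))
    return J, X_grad, Theta_grad


# Two items, one user, two features; only the first item has been rated.
X = [[1.0, 0.0], [0.0, 1.0]]
Theta = [[1.0, 1.0]]
Y = [[3.0], [0.0]]
R = [[1], [0]]
J, X_grad, Theta_grad = cofi_cost_and_grads(X, Theta, Y, R, lam=1.0)
print(J, X_grad, Theta_grad)
```

The unrated second item contributes only through the regularization term, which mirrors the elementwise multiplication by R in the matrix form.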