Regularization —— linear regression
This section is mainly an exercise in using a regularization term. In many machine learning models, if the model has a large number of parameters but only a few training samples, the fitted model easily overfits. To guard against this, the cost function adds a "penalty" on the model parameters so that they cannot grow too large; smaller parameters correspond to a simpler model, and a simpler model is less likely to overfit.
Regularized linear regression
From looking at a plot of the training data, it seems that fitting a straight line might be too simple an approximation. Instead, we will try fitting a higher-order polynomial to the data to capture more of the variations in the points.
Let's try a fifth-order polynomial. Our hypothesis will be

$h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3 + \theta_4 x^4 + \theta_5 x^5$

This means that we have a hypothesis of six features, because $x^0, x^1, \dots, x^5$ are now all features of our regression. Notice that even though we are producing a polynomial fit, we still have a linear regression problem because the hypothesis is linear in each feature.
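To make the six features concrete, here is a minimal MATLAB sketch of the feature mapping; it assumes x is the m-by-1 column of raw inputs (e.g. loaded from ex5Linx.dat), and X_poly is just an illustrative name. The full script in the Code section below builds the same matrix in place:

% map the scalar input x to the six polynomial features [1, x, x.^2, ..., x.^5]
% (assumes x is an m-by-1 column vector already loaded from ex5Linx.dat)
X_poly = [ones(length(x),1) x x.^2 x.^3 x.^4 x.^5];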
Since we are fitting a 5th-order polynomial to a data set of only 7 points, over-fitting is likely to occur. To guard against this, we will use regularization in our model.
Recall that in regularization problems, the goal is to minimize the following cost function with respect to $\theta$:

$J(\theta) = \frac{1}{2m} \left[ \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 + \lambda \sum_{j=1}^{n} \theta_j^2 \right]$
The regularization parameter $\lambda$ is a control on your fitting parameters. As the magnitudes of the fitting parameters increase, there will be an increasing penalty on the cost function. This penalty is dependent on the squares of the parameters as well as the magnitude of $\lambda$. Also, notice that the summation after $\lambda$ does not include $\theta_0^2$. The larger $\lambda$ is, the heavier the penalty on the second term, and the simpler the trained model becomes.
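As a quick illustration of this cost (not part of the original exercise script), here is a minimal MATLAB sketch; it assumes the design matrix X, the target vector y, a parameter vector theta, and a scalar lambda are already defined (all names here are illustrative):

m = length(y);   % number of training examples
h = X * theta;   % hypothesis values for every example
J = (1/(2*m)) * (sum((h - y).^2) + lambda * sum(theta(2:end).^2));
% the penalty term sum(theta(2:end).^2) deliberately skips theta_0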
Normal equations
Now we will find the best parameters of our model using the normal equations. Recall that the normal equations solution to regularized linear regression is

$\theta = \left( X^T X + \lambda \cdot \mathrm{diag}(0, 1, 1, \dots, 1) \right)^{-1} X^T y$

The matrix following $\lambda$ is an $(n+1) \times (n+1)$ diagonal matrix with a zero in the upper left and ones down the other diagonal entries. (Remember that $n$ is the number of features, not counting the intercept term.) The vector $y$ and the matrix $X$ have the same definition they had for unregularized regression: $X$ is the $m \times (n+1)$ design matrix whose $i$-th row is $(x^{(i)})^T$ with a leading 1 for the intercept, and $y$ is the $m \times 1$ vector of targets.
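Before the full script, this closed form can be written as a single MATLAB expression. This is only a sketch: it assumes the design matrix x (with the leading column of ones) and the vector y are already built, and that lambda is a scalar; reg, theta, and lambda are illustrative names, while the script in the Code section uses rm, sida, and lamda:

n = size(x,2) - 1;                      % number of features, not counting the intercept
reg = diag([0; ones(n,1)]);             % (n+1)-by-(n+1): zero for theta_0, ones elsewhere
theta = (x'*x + lambda*reg) \ (x'*y);   % the regularized normal equation

The backslash solve is the numerically preferred alternative to forming inv(...) explicitly; the script below uses inv, which is fine at this small scale.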
Using this equation, find values for $\theta$ using the three regularization parameters below:

a. $\lambda = 0$ (this is the same case as non-regularized linear regression)

b. $\lambda = 1$

c. $\lambda = 10$
Code
clc,clear
% load the data
x = load('ex5Linx.dat');
y = load('ex5Liny.dat');

% plot the raw data
plot(x,y,'o','MarkerEdgeColor','b','MarkerFaceColor','r')

% turn the single feature into the fifth-order polynomial design matrix
x = [ones(length(x),1) x x.^2 x.^3 x.^4 x.^5];
[m n] = size(x);
n = n - 1;

% compute the parameters theta and plot the fitted curves
rm = diag([0;ones(n,1)]);   % the diagonal matrix that multiplies lambda
lamda = [0 1 10]';
colortype = {'g','b','r'};
sida = zeros(n+1,3);        % initialize the parameters theta, one column per lambda
xrange = linspace(min(x(:,2)),max(x(:,2)))';
hold on;
for i = 1:3
    sida(:,i) = inv(x'*x+lamda(i).*rm)*x'*y;   % regularized normal equation
    norm_sida = norm(sida(:,i))                % 2-norm of the current theta
    yrange = [ones(size(xrange)) xrange xrange.^2 xrange.^3,...
        xrange.^4 xrange.^5]*sida(:,i);
    plot(xrange',yrange,char(colortype(i)))
    hold on
end
legend('training data', '\lambda=0', '\lambda=1','\lambda=10')   % note the escaped \lambda in the legend strings
hold off
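As an optional follow-up check (not in the original script), the three norms can be printed side by side after the loop; with the regularization above, the norm of $\theta$ should decrease as $\lambda$ increases:

% one 2-norm per column of sida, i.e. per value of lambda
theta_norms = sqrt(sum(sida.^2, 1))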