Regularization —— linear regression
This section practices the use of a regularization term. In many machine-learning models, when the model has many parameters but only a few training samples, the fitted model easily overfits. To guard against this, the loss function adds a penalty on the model's parameters so that they stay small; smaller parameters mean a simpler model, and a simpler model is less prone to overfitting.
Regularized linear regression
From looking at a plot of the training data, it seems that fitting a straight line might be too simple an approximation. Instead, we will try fitting a higher-order polynomial to the data to capture more of the variation in the points.
Let's try a fifth-order polynomial. Our hypothesis will be

$$h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3 + \theta_4 x^4 + \theta_5 x^5$$

This means that we have a hypothesis of six features, because $x^0, x^1, \ldots, x^5$ are now all features of our regression. Notice that even though we are producing a polynomial fit, we still have a linear regression problem because the hypothesis is linear in each feature.
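To make that last point concrete, here is a minimal sketch (x0 and theta are illustrative names, not from the exercise code): once the six polynomial features are built, evaluating the hypothesis is just a dot product.

% x0 is one scalar input, theta a 6x1 parameter vector (both hypothetical here)
features = [1, x0, x0^2, x0^3, x0^4, x0^5];  % the six features x^0 ... x^5
h = features * theta;                        % a dot product: linear in each feature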
Since we are fitting a 5th-order polynomial to a data set of only 7 points, over-fitting is likely to occur. To guard against this, we will use regularization in our model.
Recall that in regularization problems, the goal is to minimize the following cost function with respect to $\theta$:

$$J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda \sum_{j=1}^{n} \theta_j^2\right]$$

The regularization parameter $\lambda$ is a control on your fitting parameters. As the magnitudes of the fitting parameters increase, there will be an increasing penalty on the cost function. This penalty depends on the squares of the parameters as well as the magnitude of $\lambda$. Also, notice that the summation after $\lambda$ does not include $\theta_0^2$.
The larger $\lambda$ is, the simpler the trained model, because the penalty on the second term is heavier.
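As a sketch of this cost function in MATLAB (an illustrative helper, not part of the original exercise code; save it as reg_cost.m, and note that x here is the m-by-(n+1) design matrix built in the Code section below):

% Regularized cost J(theta); the penalty skips theta(1), the intercept term
function J = reg_cost(theta, x, y, lamda)
    m = length(y);
    err = x*theta - y;                                    % h_theta(x) - y per sample
    J = (err'*err + lamda*sum(theta(2:end).^2)) / (2*m);
end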
Normal equations
Now we will find the best parameters of our model using the normal equations. Recall that the normal equations solution to regularized linear regression is

$$\theta = \left(X^T X + \lambda \begin{bmatrix} 0 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{bmatrix}\right)^{-1} X^T \vec{y}$$

The matrix following $\lambda$ is an $(n+1) \times (n+1)$ diagonal matrix with a zero in the upper left and ones down the other diagonal entries. (Remember that $n$ is the number of features, not counting the intercept term.) The vector $\vec{y}$ and the matrix $X$ have the same definition they had for unregularized regression: $\vec{y}$ is the $m$-vector of training targets, and $X$ is the $m \times (n+1)$ design matrix whose $i$-th row holds the features $[1, x^{(i)}, (x^{(i)})^2, \ldots, (x^{(i)})^5]$ of the $i$-th training example.
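In MATLAB this matrix takes one line with diag; a sketch for our n = 5 features (it mirrors the rm variable in the Code section below):

n = 5;                      % number of features, not counting the intercept
rm = diag([0; ones(n,1)]);  % zero in the upper left, ones down the rest of the diagonal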
Using this equation, find values for $\theta$ using the three regularization parameters below:

a. $\lambda = 0$ (this is the same case as non-regularized linear regression)

b. $\lambda = 1$

c. $\lambda = 10$
Code
clc, clear

% Load the data
x = load('ex5Linx.dat');
y = load('ex5Liny.dat');

% Plot the raw training data
plot(x, y, 'o', 'MarkerEdgeColor', 'b', 'MarkerFaceColor', 'r')

% Turn the inputs into the training design matrix: intercept plus x ... x^5
x = [ones(length(x),1) x x.^2 x.^3 x.^4 x.^5];
[m, n] = size(x);
n = n - 1;                       % n = number of features, excluding the intercept

% Compute the parameters sida (theta) and plot the fitted curves
rm = diag([0; ones(n,1)]);       % the matrix following lambda
lamda = [0 1 10]';
colortype = {'g', 'b', 'r'};
sida = zeros(n+1, 3);            % one column of parameters per lambda
xrange = linspace(min(x(:,2)), max(x(:,2)))';
hold on;
for i = 1:3
    sida(:,i) = inv(x'*x + lamda(i).*rm)*x'*y;  % normal-equations solution
    norm_sida = norm(sida(:,i))  % 2-norm of theta; it shrinks as lambda grows
    yrange = [ones(size(xrange)) xrange xrange.^2 xrange.^3, ...
              xrange.^4 xrange.^5]*sida(:,i);
    plot(xrange', yrange, char(colortype(i)))
    hold on
end
legend('training data', '\lambda=0', '\lambda=1', '\lambda=10')  % note the escaped \lambda
hold off
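One design note on the solve step: inv(x'*x + lamda(i).*rm)*x'*y is fine for this tiny 7-point problem, but MATLAB's backslash operator solves the same linear system and is the numerically preferable choice:

% Equivalent, better-conditioned alternative to the inv() line inside the loop
sida(:,i) = (x'*x + lamda(i).*rm) \ (x'*y);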


