Regularization —— linear regression

本节主要是练习regularization项的使用原则。因为在机器学习的一些模型中，如果模型的参数太多，而训练样本又太少的话，这样训练出来的模型很容易产生过拟合现象。因此在模型的损失函数中，需要对模型的参数进行“惩罚”，这样的话这些参数就不会太大，而越小的参数说明模型越简单，越简单的模型则越不容易产生过拟合现象。

Regularized linear regression

From looking at this plot, it seems that fitting a straight line might be too simple of an approximation. Instead, we will try fitting a higher-order polynomial to the data to capture more of the variations in the points.

Let's try a fifth-order polynomial. Our hypothesis will be

This means that we have a hypothesis of six features, because are now all features of our regression. Notice that even though we are producing a polynomial fit, we still have a linear regression problem because the hypothesis is linear in each feature.

Since we are fitting a 5th-order polynomial to a data set of only 7 points, over-fitting is likely to occur. To guard against this, we will use regularization in our model.

Recall that in regularization problems, the goal is to minimize the following cost function with respect to :

The regularization parameter is a control on your fitting parameters. As the magnitues of the fitting parameters increase, there will be an increasing penalty on the cost function. This penalty is dependent on the squares of the parameters as well as the magnitude of . Also, notice that the summation after does not include

lamda 越大，训练出的模型越简单 —— 后一项的惩罚越大

Normal equations

Now we will find the best parameters of our model using the normal equations. Recall that the normal equations solution to regularized linear regression is

The matrix following is an diagonal matrix with a zero in the upper left and ones down the other diagonal entries. (Remember that is the number of features, not counting the intecept term). The vector and the matrix have the same definition they had for unregularized regression:

Using this equation, find values for using the three regularization parameters below:

a. (this is the same case as non-regularized linear regression)

b.

c.

Code

clc,clear

%加载数据

x = load('ex5Linx.dat');

y = load('ex5Liny.dat');

%显示原始数据

plot(x,y,'o','MarkerEdgeColor','b','MarkerFaceColor','r')

%将特征值变成训练样本矩阵

x = [ones(length(x),) x x.^ x.^ x.^ x.^];

[m n] = size(x);

n = n -;

%计算参数sidta，并且绘制出拟合曲线

rm = diag([;ones(n,)]);%lamda后面的矩阵

lamda = [  ]';

colortype = {'g','b','r'};

sida = zeros(n+,); %初始化参数sida

xrange = linspace(min(x(:,)),max(x(:,)))';

hold on;

for i = :

    sida(:,i) = inv(x'*x+lamda(i).*rm)*x'*y;%计算参数sida

    norm_sida = norm(sida) % norm 求sida的2阶范数

    yrange = [ones(size(xrange)) xrange xrange.^ xrange.^,...

        xrange.^ xrange.^]*sida(:,i);

    plot(xrange',yrange,char(colortype(i)))

    hold on

end

legend('traning data', '\lambda=0', '\lambda=1','\lambda=10')%注意转义字符的使用方法

hold off

Regularization —— linear regression的更多相关文章

machine learning(14) --Regularization:Regularized linear regression
machine learning(13) --Regularization:Regularized linear regression Gradient descent without regular ...
Matlab实现线性回归和逻辑回归: Linear Regression & Logistic Regression
原文:http://blog.csdn.net/abcjennifer/article/details/7732417 本文为Maching Learning 栏目补充内容,为上几章中所提到单参数线性 ...
Stanford机器学习---第二讲. 多变量线性回归 Linear Regression with multiple variable
原文:http://blog.csdn.net/abcjennifer/article/details/7700772 本栏目(Machine learning)包括单参数的线性回归.多参数的线性回归 ...
Stanford机器学习---第一讲. Linear Regression with one variable
原文:http://blog.csdn.net/abcjennifer/article/details/7691571 本栏目(Machine learning)包括单参数的线性回归.多参数的线性回归 ...
Regularized Linear Regression with scikit-learn
Regularized Linear Regression with scikit-learn Earlier we covered Ordinary Least Squares regression ...
机器学习笔记-1 Linear Regression with Multiple Variables(week 2)
1. Multiple Features note:X0 is equal to 1 2. Feature Scaling Idea: make sure features are on a simi ...
Simple tutorial for using TensorFlow to compute a linear regression
"""Simple tutorial for using TensorFlow to compute a linear regression. Parag K. Mita ...
第五次编程作业-Regularized Linear Regression and Bias v.s. Variance
1.正规化的线性回归 (1)代价函数 (2)梯度 linearRegCostFunction.m function [J, grad] = linearRegCostFunction(X, y, th ...
[UFLDL] Linear Regression & Classification
博客内容取材于:http://www.cnblogs.com/tornadomeet/archive/2012/06/24/2560261.html Deep learning:六(regulariz ...

随机推荐

PostgreSQL Replication之第九章与pgpool一起工作（3）
9.3 理解pgpool的架构一旦我们安装了pgpool,是时候来讨论软件架构了.从一个用户的角度看,pgpool就像一个正常的数据库服务器,您可以想连接任何其他服务器一样连接到它: pgpool ...
net实现ping的方法
class ServicePinger { private static readonly ILog log = LogManager.GetLogger(typeof(ServicePinger)) ...
ReactiveCocoa使用记录-网络登录事件
对于一个应用来说,绝大部分的时间都是在等待某些事件的发生或响应某些状态的变化,比如用户的触摸事件.应用进入后台.网络请求成功刷新界面等等,而维护这些状态的变化,常常会使代码变得非常复杂,难以扩展.而 ...
初学者指南：ZFS 是什么，为什么要使用 ZFS？
作者: John Paul 译者: LCTT Lv Feng 今天,我们来谈论一下 ZFS,一个先进的文件系统.我们将讨论 ZFS 从何而来,它是什么,以及为什么它在科技界和企业界如此受欢迎. 虽然我 ...
NodeJS学习笔记进阶 (3)Nodejs 进阶：Express 常用中间件 body-parser 实现解析(ok)
个人总结:Node.js处理post表单需要body-parser,这篇文章进行了详细的讲解. 摘选自网络写在前面 body-parser是非常常用的一个express中间件,作用是对http请求体 ...
《Python生物信息学数据管理》中文PDF+英文PDF+代码
生物信息学经典资料,解决生物学问题,通过"编程技法"的形式,涵盖尽可能多的组织.分析.表现结果的策略.在每章结尾都会有为生物研究者设计的编程题目,适合教学和自学.由六部分组成:Py ...
Laravel核心解读--HTTP内核
Http Kernel Http Kernel是Laravel中用来串联框架的各个核心组件来网络请求的,简单的说只要是通过public/index.php来启动框架的都会用到Http Kernel,而 ...
5.应用与模块（ng-app）
转自:https://www.cnblogs.com/best/tag/Angular/ 自动载入启动一个AngularJS应用,声明了ng-app的元素会成为$rootScope的起点每个HTML ...
Linux系统安全加固（一）
Linux系统安全加固(一) 去年8月,某所网站遭黑客攻击瘫痪虽然港交所随后及时启用备用系统,但还是致使7支股票1支债卷被迫停牌,次日再次遭受攻击而瘫痪:在去年年底继CSDN信息安全出现之后, ...
jasperreport 追加新报表（2）
用ireport做好模版后,如果要新加一个打印页,如果是新手,直接修改模版应该是理想情况, 可是什么数据源 feild,parameter,var,subreport ,还有路径, 真的可以让一个人疯 ...

Regularization —— linear regression

Regularization —— linear regression的更多相关文章

随机推荐

热门专题