Regularized logistic regression

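The problem is as follows: we are given a training set with two features, and the distribution of the data shows that the two classes are not linearly separable, so higher-order features are needed to model them. This program, for example, uses feature terms up to the sixth power.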

Data

To begin, load the files 'ex5Logx.dat' and 'ex5Logy.dat' into your program. This dataset represents the training set of a logistic regression problem with two features. To avoid confusion later, we will refer to the two input features contained in 'ex5Logx.dat' as $u$ and $v$. In the 'ex5Logx.dat' file, the first column of numbers represents the feature $u$, which you will plot on the horizontal axis, and the second column represents $v$, which you will plot on the vertical axis.

After loading the data, plot the points using different markers to distinguish between the two classifications. The commands in Matlab/Octave will be:
x = load('ex5Logx.dat');
y = load('ex5Logy.dat');

figure

% Find the indices for the 2 classes
pos = find(y); neg = find(y == 0);

plot(x(pos, 1), x(pos, 2), '+')
hold on
plot(x(neg, 1), x(neg, 2), 'o')

After plotting, your figure should show the positive examples (y = 1) as '+' markers and the negative examples (y = 0) as 'o' markers.

Model

The hypothesis function is

$$h_\theta(x) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}} = P(y = 1 \mid x; \theta)$$

Let's look at the parameter $\theta^T x$ in the sigmoid function $g(\theta^T x)$.

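In this exercise, we will assign $x$ to be all monomials (meaning polynomial terms) of $u$ and $v$ up to the sixth power:

$$x = \begin{bmatrix} 1 & u & v & u^2 & uv & v^2 & u^3 & \cdots & uv^5 & v^6 \end{bmatrix}^T$$

To clarify this notation: we have made a 28-feature vector $x$ where $x_0 = 1$, $x_1 = u$, $x_2 = v$, ..., $x_{27} = v^6$. Recall that $u$ is the first column of 'ex5Logx.dat' and $v$ is the second.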
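The feature-mapping file itself is not shown in this post. As a minimal sketch of what such a function might look like, assuming the monomials are ordered by total degree (1, u, v, u^2, uv, v^2, ...), the following could be saved as map_feature.m. The body is an illustration under that assumption, not necessarily the file shipped with the exercise.

```octave
function out = map_feature(feat1, feat2)
    % Sketch of a degree-6 polynomial feature map (assumed implementation).
    % feat1, feat2: column vectors of equal length holding u and v.
    % out: one row per example, 28 columns: 1, u, v, u^2, u*v, v^2, ..., v^6.
    degree = 6;
    out = ones(size(feat1(:, 1)));          % x_0 = 1 (intercept term)
    for i = 1:degree                        % total degree of the monomial
        for j = 0:i                         % power of v; u gets power i-j
            out(:, end + 1) = (feat1 .^ (i - j)) .* (feat2 .^ j);
        end
    end
end
```

Calling map_feature(x(:,1), x(:,2)) on an m-by-2 data matrix would then return an m-by-28 matrix, matching the 28-feature vector described above.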

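With the regularization term added, the cost function becomes:

$$J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left[ -y^{(i)} \log h_\theta(x^{(i)}) - (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right] + \frac{\lambda}{2m} \sum_{j=1}^{27} \theta_j^2$$

where $\lambda$ is the regularization parameter and the intercept $\theta_0$ is not penalized.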
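As a quick sanity check on the regularized cost, here is a small sketch that evaluates it on made-up toy data. The names cost, x, y and J0 and the data values are assumptions for illustration only. With theta set to zero every prediction is 0.5, so the cost should equal log(2) whatever lambda is.

```octave
% Sigmoid function.
g = @(z) 1.0 ./ (1.0 + exp(-z));

% Regularized cost (hypothetical helper, not part of the exercise files).
% x: m-by-n design matrix whose first column is all ones,
% y: m-by-1 vector of 0/1 labels, theta(1) is the unpenalized intercept.
cost = @(theta, x, y, lambda) ...
    (1 / size(x, 1)) * sum(-y .* log(g(x * theta)) ...
                           - (1 - y) .* log(1 - g(x * theta))) ...
    + (lambda / (2 * size(x, 1))) * sum(theta(2:end) .^ 2);

% Toy data, made up for illustration (not from ex5Logx.dat).
x = [1 0.5; 1 -1.0; 1 2.0];
y = [1; 0; 1];

% At theta = 0 the penalty vanishes and each example contributes log(2).
J0 = cost(zeros(2, 1), x, y, 1);
```

Excluding theta(1) from the penalty mirrors the cost computed in the code later in this post, which uses theta(2:end).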

Newton’s method

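Recall that the Newton's method update rule is

$$\theta^{(t+1)} = \theta^{(t)} - H^{-1} \nabla_\theta J$$

With regularization, the gradient and the Hessian are

$$\nabla_\theta J = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)} + \frac{\lambda}{m} \begin{bmatrix} 0 \\ \theta_1 \\ \vdots \\ \theta_{27} \end{bmatrix}$$

$$H = \frac{1}{m} \sum_{i=1}^{m} h_\theta(x^{(i)}) \left( 1 - h_\theta(x^{(i)}) \right) x^{(i)} (x^{(i)})^T + \frac{\lambda}{m} \begin{bmatrix} 0 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{bmatrix}$$

1. $x^{(i)}$ is your feature vector, which is a 28x1 vector in this exercise.

2. $\nabla_\theta J$ is a 28x1 vector.

3. $x^{(i)} (x^{(i)})^T$ and $H$ are 28x28 matrices.

4. $y^{(i)}$ and $h_\theta(x^{(i)})$ are scalars.

5. The matrix following $\frac{\lambda}{m}$ in the Hessian formula is a 28x28 diagonal matrix with a zero in the upper left and ones on every other diagonal entry.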
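One way to gain confidence in the gradient and Hessian before running Newton's method is a finite-difference check. The sketch below compares the analytic gradient with a central-difference approximation on made-up toy data. All names and values here (cost, eps_fd, the toy x and y) are assumptions for illustration and are not part of the exercise.

```octave
g = @(z) 1.0 ./ (1.0 + exp(-z));
cost = @(theta, x, y, lambda) ...
    (1 / size(x, 1)) * sum(-y .* log(g(x * theta)) ...
                           - (1 - y) .* log(1 - g(x * theta))) ...
    + (lambda / (2 * size(x, 1))) * sum(theta(2:end) .^ 2);

% Toy data (made up): intercept column plus two features.
x = [1 0.5 0.25; 1 -1 1; 1 2 4; 1 0.1 0.01];
y = [1; 0; 1; 0];
[m, n] = size(x);
lambda = 1;
theta = [0.1; -0.2; 0.3];

% Analytic gradient and Hessian; the intercept is not regularized.
h = g(x * theta);
D = eye(n); D(1, 1) = 0;
grad = (1 / m) * x' * (h - y) + (lambda / m) * D * theta;
H = (1 / m) * x' * diag(h .* (1 - h)) * x + (lambda / m) * D;

% Central-difference approximation of the gradient.
eps_fd = 1e-6;
num_grad = zeros(n, 1);
for j = 1:n
    e = zeros(n, 1); e(j) = eps_fd;
    num_grad(j) = (cost(theta + e, x, y, lambda) ...
                   - cost(theta - e, x, y, lambda)) / (2 * eps_fd);
end
max_err = max(abs(grad - num_grad));   % should be tiny if grad is right

% One Newton step from the current theta.
theta_new = theta - H \ grad;
```

If max_err is not close to zero, the usual culprits are a missing 1/m factor or a penalty that accidentally includes the intercept.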

After convergence, use your values of theta to find the decision boundary in the classification problem. The decision boundary is defined as the line where $P(y = 1 \mid x; \theta) = 0.5$, which is equivalent to $\theta^T x = 0$.
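Because the sigmoid equals 0.5 exactly when its argument is zero, classifying a point only requires the sign of theta' * x. A minimal sketch, using a made-up theta and a made-up design matrix rather than the fitted values from this exercise:

```octave
g = @(z) 1.0 ./ (1.0 + exp(-z));

% Hypothetical fitted parameters and design matrix (leading column of ones).
theta = [-1; 1];
x = [1 2; 1 -3; 1 0.5];

z = x * theta;        % linear term theta' * x for each row
p = g(z);             % predicted probability that y = 1
pred = (z >= 0);      % same decision as p >= 0.5
```

On the real data, mean(double(pred == y)) would give the training accuracy, assuming y holds the 0/1 labels.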

Code

% Load the data
clc, clear, close all;
x = load('ex5Logx.dat');
y = load('ex5Logy.dat');

% Plot the data distribution
plot(x(find(y), 1), x(find(y), 2), 'o', 'MarkerFaceColor', 'b')
hold on;
plot(x(find(y == 0), 1), x(find(y == 0), 2), 'r+')
legend('y=1', 'y=0')

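% Add polynomial features to x by
% calling the feature mapping function
% provided in separate m-file
x = map_feature(x(:,1), x(:,2));  % map to the higher-dimensional feature space
[m, n] = size(x);

% Initialize fitting parameters
theta = zeros(n, 1);

% Define the sigmoid function
% (anonymous function; MATLAB no longer recommends inline())
g = @(z) 1.0 ./ (1.0 + exp(-z));

% Setup for Newton's method
% (the iteration count is missing in the source; 15 is an assumed value)
MAX_ITR = 15;
J = zeros(MAX_ITR, 1);

% Lambda is the regularization parameter
% (the values are missing in the source; 0, 1 and 10 are assumed)
% Change lambda here and run three times to get three different results.
lambda = 0;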

% Newton's Method
for i = 1:MAX_ITR
    % Calculate the hypothesis function
    z = x * theta;
    h = g(z);

    % Calculate J (for testing convergence) -- the cost function
    J(i) = (1/m)*sum(-y.*log(h) - (1-y).*log(1-h)) + ...
        (lambda/(2*m))*norm(theta(2:end))^2;

    % Calculate gradient and Hessian.
    G = (lambda/m).*theta; G(1) = 0;     % extra term for gradient
    L = (lambda/m).*eye(n); L(1) = 0;    % extra term for Hessian
    grad = ((1/m).*x' * (h-y)) + G;
    H = ((1/m).*x' * diag(h) * diag(1-h) * x) + L;

    % Here is the actual update
    theta = theta - H\grad;
end

% Plot the results
% We will evaluate theta*x over a
% grid of features and plot the contour
% where theta*x equals zero

% Here is the grid range
% (the lower bound and point count are missing in the source; -1 and 200 are assumed)
u = linspace(-1, 1.5, 200);
v = linspace(-1, 1.5, 200);
z = zeros(length(u), length(v));

% Evaluate z = theta*x over the grid
for i = 1:length(u)
    for j = 1:length(v)
        % z holds the linear term theta'*x at each grid point,
        % not the cost-versus-iteration curve
        z(i,j) = map_feature(u(i), v(j))*theta;
    end
end
z = z'; % important to transpose z before calling contour

% Plot z = 0
% Notice you need to specify the range [0, 0]
% The z = 0 contour is where the predicted probability is exactly 0.5
% (line width and font size are missing in the source; 2 and 14 are assumed)
contour(u, v, z, [0, 0], 'LineWidth', 2)
legend('y = 1', 'y = 0', 'Decision boundary')
title(sprintf('\\lambda = %g', lambda), 'FontSize', 14)
hold off

% Uncomment to plot J
% (marker size is missing in the source; 8 is assumed)
% figure
% plot(0:MAX_ITR-1, J, 'o--', 'MarkerFaceColor', 'r', 'MarkerSize', 8)
% xlabel('Iteration'); ylabel('J')
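The code comment above suggests rerunning the script with different values of lambda. As a rough sketch of that experiment in a single loop, the snippet below fits made-up, non-separable one-dimensional toy data for three assumed lambda values and records the size of the penalized coefficients. The data, the lambda values and the iteration count are all assumptions for illustration, not the exercise's own.

```octave
% Toy, non-separable 1-D data (made up; not the exercise data).
s = [-2; -1; 0; 0.5; 1; 2];
y = [0; 0; 1; 1; 0; 1];
x = [ones(size(s)), s];            % leading column of ones for the intercept
[m, n] = size(x);
g = @(z) 1.0 ./ (1.0 + exp(-z));

lambdas = [0, 1, 10];              % assumed regularization strengths
norms = zeros(size(lambdas));
D = eye(n); D(1, 1) = 0;           % do not penalize the intercept

for k = 1:numel(lambdas)
    lambda = lambdas(k);
    theta = zeros(n, 1);
    for it = 1:15                  % assumed iteration count
        h = g(x * theta);
        grad = (1 / m) * x' * (h - y) + (lambda / m) * D * theta;
        H = (1 / m) * x' * diag(h .* (1 - h)) * x + (lambda / m) * D;
        theta = theta - H \ grad;
    end
    norms(k) = norm(theta(2:end)); % size of the penalized coefficients
    fprintf('lambda = %g, norm(theta(2:end)) = %.4f\n', lambda, norms(k));
end
```

Larger lambda values should shrink the penalized coefficients, which on the real data corresponds to a smoother decision boundary.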

Result
