Part 1: Logistic Regression

  Background: Suppose you are an admissions officer at a university and you decide whether to admit each applicant based on their exam scores. You are given a historical data set, ex2data1.txt, in which the first column is the score on the first exam, the second column is the score on the second exam, and in the third column 1 means the applicant was admitted and 0 means they were not. Using this data set, build a model that can serve as the admission criterion from now on.

  

  Plotting the data shows that the two classes are roughly separated by a straight line, and since the outcome takes only two values, we fit the data with logistic regression.

  

  1. The main script, ex2.m:

%% Machine Learning Online Class - Exercise 2: Logistic Regression
%
% Instructions
% ------------
%
% This file contains code that helps you get started on the logistic
% regression exercise. You will need to complete the following functions
% in this exercise:
%
%     sigmoid.m
%     costFunction.m
%     predict.m
%     costFunctionReg.m
%
% For this exercise, you will not need to change any code in this file,
% or any other files other than those mentioned above.
%

%% Initialization
clear ; close all; clc

%% Load Data
% The first two columns contain the exam scores and the third column
% contains the label.

data = load('ex2data1.txt');
X = data(:, [1, 2]); y = data(:, 3);

%% ==================== Part 1: Plotting ====================
% We start the exercise by first plotting the data to understand the
% problem we are working with.

fprintf(['Plotting data with + indicating (y = 1) examples and o ' ...
         'indicating (y = 0) examples.\n']);

plotData(X, y);

% Put some labels
hold on;
% Labels and Legend
xlabel('Exam 1 score')
ylabel('Exam 2 score')

% Specified in plot order
legend('Admitted', 'Not admitted')
hold off;

fprintf('\nProgram paused. Press enter to continue.\n');
pause;

%% ============ Part 2: Compute Cost and Gradient ============
% In this part of the exercise, you will implement the cost and gradient
% for logistic regression. You need to complete the code in
% costFunction.m

% Setup the data matrix appropriately, and add ones for the intercept term
[m, n] = size(X);

% Add intercept term to x and X_test
X = [ones(m, 1) X];

% Initialize fitting parameters
initial_theta = zeros(n + 1, 1);

% Compute and display initial cost and gradient
[cost, grad] = costFunction(initial_theta, X, y);

fprintf('Cost at initial theta (zeros): %f\n', cost);
fprintf('Expected cost (approx): 0.693\n');
fprintf('Gradient at initial theta (zeros): \n');
fprintf(' %f \n', grad);
fprintf('Expected gradients (approx):\n -0.1000\n -12.0092\n -11.2628\n');

% Compute and display cost and gradient with non-zero theta
test_theta = [-24; 0.2; 0.2];
[cost, grad] = costFunction(test_theta, X, y);

fprintf('\nCost at test theta: %f\n', cost);
fprintf('Expected cost (approx): 0.218\n');
fprintf('Gradient at test theta: \n');
fprintf(' %f \n', grad);
fprintf('Expected gradients (approx):\n 0.043\n 2.566\n 2.647\n');

fprintf('\nProgram paused. Press enter to continue.\n');
pause;

%% ============= Part 3: Optimizing using fminunc =============
% In this exercise, you will use a built-in function (fminunc) to find the
% optimal parameters theta.

% Set options for fminunc
options = optimset('GradObj', 'on', 'MaxIter', 400);

% Run fminunc to obtain the optimal theta
% This function will return theta and the cost
[theta, cost] = ...
    fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);

% Print theta to screen
fprintf('Cost at theta found by fminunc: %f\n', cost);
fprintf('Expected cost (approx): 0.203\n');
fprintf('theta: \n');
fprintf(' %f \n', theta);
fprintf('Expected theta (approx):\n');
fprintf(' -25.161\n 0.206\n 0.201\n');

% Plot Boundary
plotDecisionBoundary(theta, X, y);

% Put some labels
hold on;
% Labels and Legend
xlabel('Exam 1 score')
ylabel('Exam 2 score')

% Specified in plot order
legend('Admitted', 'Not admitted')
hold off;

fprintf('\nProgram paused. Press enter to continue.\n');
pause;

%% ============== Part 4: Predict and Accuracies ==============
% After learning the parameters, you'll want to use it to predict the outcomes
% on unseen data. In this part, you will use the logistic regression model
% to predict the probability that a student with score 45 on exam 1 and
% score 85 on exam 2 will be admitted.
%
% Furthermore, you will compute the training and test set accuracies of
% our model.
%
% Your task is to complete the code in predict.m

% Predict probability for a student with score 45 on exam 1
% and score 85 on exam 2
prob = sigmoid([1 45 85] * theta);
fprintf(['For a student with scores 45 and 85, we predict an admission ' ...
         'probability of %f\n'], prob);
fprintf('Expected value: 0.775 +/- 0.002\n\n');

% Compute accuracy on our training set
p = predict(theta, X);

fprintf('Train Accuracy: %f\n', mean(double(p == y)) * 100);
fprintf('Expected accuracy (approx): 89.0\n');
fprintf('\n');

ex2.m

  

  2. Visualizing the data, plotData.m:

function plotData(X, y)
%PLOTDATA Plots the data points X and y into a new figure
%   PLOTDATA(x,y) plots the data points with + for the positive examples
%   and o for the negative examples. X is assumed to be a Mx2 matrix.

% Create New Figure
figure; hold on;

% ====================== YOUR CODE HERE ======================
% Instructions: Plot the positive and negative examples on a
%               2D plot, using the option 'k+' for the positive
%               examples and 'ko' for the negative examples.
%

pos = find(y == 1);   % indices of positive (admitted) examples
neg = find(y == 0);   % indices of negative (not admitted) examples
plot(X(pos, 1), X(pos, 2), 'k+', 'LineWidth', 2, 'MarkerSize', 7);
plot(X(neg, 1), X(neg, 2), 'ko', 'MarkerFaceColor', 'y', 'MarkerSize', 7);

% =========================================================================

hold off;

end

plotData.m

  

  3. The logistic (sigmoid) function used by logistic regression:

  $h_{\theta}(x)=g(\theta^{T}x)$: the predicted probability that $y=1$ given input $x$

  $g(z)=\frac{1}{1+e^{-z}}$  

function g = sigmoid(z)
%SIGMOID Compute sigmoid function
%   g = SIGMOID(z) computes the sigmoid of z.

% You need to return the following variables correctly
g = zeros(size(z));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the sigmoid of each value of z (z can be a matrix,
%               vector or scalar).

g = 1 ./ (1 + exp(-z));

% =============================================================

end

sigmoid.m
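  A quick sanity check of the implementation (a sketch, not part of the original script): the sigmoid should return 0.5 at 0 and saturate towards 0 and 1 for large negative and positive inputs, element-wise on vectors.

sigmoid(0)          % 0.5
sigmoid(100)        % very close to 1
sigmoid([-5 0 5])   % approximately [0.0067 0.5000 0.9933]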

  4. The logistic regression cost function:

  $J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}[y^{(i)}log(h_\theta(x^{(i)}))+(1-y^{(i)})log(1-h_{\theta}(x^{(i)}))]$

function [J, grad] = costFunction(theta, X, y)
%COSTFUNCTION Compute cost and gradient for logistic regression
%   J = COSTFUNCTION(theta, X, y) computes the cost of using theta as the
%   parameter for logistic regression and the gradient of the cost
%   w.r.t. to the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;
grad = zeros(size(theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta
%
% Note: grad should have the same dimensions as theta
%

h = sigmoid(X * theta);                               % h_theta(x) for all examples
J = -sum(y .* log(h) + (1 - y) .* log(1 - h)) / m;    % cost function
grad = (X') * (h - y) ./ m;                           % gradient without the learning rate alpha; fminunc will use it later

% An equivalent vectorized version:
## h = sigmoid(X * theta);
## J = sum(-y' * log(h) - (1 - y)' * log(1 - h)) / m;
## grad = ((h - y)' * X) / m;

% =============================================================

end

costFunction.m
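  The script's expected initial cost of 0.693 follows directly from this formula: with $\theta=0$ we have $h_\theta(x^{(i)})=g(0)=0.5$ for every example, so $J(0)=-\frac{1}{m}\sum_{i=1}^{m}[y^{(i)}log(0.5)+(1-y^{(i)})log(0.5)]=-log(0.5)=log(2)\approx 0.693$, regardless of the data.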

  5. Gradient descent with learning rate $\alpha$:

  $\theta_j:=\theta_j-\frac{\alpha}{m }\sum_{i=1}^{m}[(h_\theta(x^{(i)})-y^{(i)})x^{(i)}_j]$
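  The exercise never actually runs this update rule (it hands the gradient to fminunc instead), but a minimal sketch of it in Octave, assuming X already includes the intercept column and that alpha and num_iters are hypothetical values you would have to tune, looks like this:

alpha = 0.001;                  % hypothetical learning rate
num_iters = 400;                % hypothetical number of iterations
m = length(y);                  % number of training examples
theta = zeros(size(X, 2), 1);   % start from zeros
for iter = 1:num_iters
    h = sigmoid(X * theta);                       % h_theta(x) for all examples
    theta = theta - (alpha / m) * (X' * (h - y)); % simultaneous update of all theta_j
end

  On this dataset the raw exam scores are large and unscaled, so a fixed $\alpha$ converges slowly unless the features are normalized first, which is one reason the exercise switches to fminunc.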

  

  Gradient without the learning rate $\alpha$ (this is what we hand to fminunc):

  $\frac{\partial J(\theta)}{\partial \theta_j}=\frac{1}{m}\sum_{i=1}^{m}[(h_\theta(x^{(i)})-y^{(i)})x^{(i)}_j]$

  We use the built-in fminunc function to fit the parameters $\theta$. Previously we fit $\theta$ with our own gradient descent, which would also work here, but fminunc chooses the step size for us: we only supply a maximum number of iterations, a cost function that returns the cost and the gradient, and an initial $\theta$, and it returns the optimal $\theta$. You can think of it as a souped-up version of gradient descent.

options = optimset('GradObj', 'on', 'MaxIter', 400);
[theta, cost] = ...
    fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);   % costFunction is the function we wrote above

  

  6. Using the fitted parameters $\theta$ to make predictions. For example, the probability that a student with a score of 45 on the first exam and 85 on the second exam is admitted:

prob = sigmoid([1 45 85] * theta); % probability of admission

  Predicting on the training samples X, we see that the training accuracy is about 89%.

function p = predict(theta, X)
%PREDICT Predict whether the label is 0 or 1 using learned logistic
%regression parameters theta
%   p = PREDICT(theta, X) computes the predictions for X using a
%   threshold at 0.5 (i.e., if sigmoid(theta'*x) >= 0.5, predict 1)

m = size(X, 1); % Number of training examples

% You need to return the following variables correctly
p = zeros(m, 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Complete the following code to make predictions using
%               your learned logistic regression parameters.
%               You should set p to a vector of 0's and 1's
%

% First approach
for i = 1:m
    p(i, 1) = sigmoid(X(i, :) * theta) >= 0.5;   % predict 1 when the probability is at least 0.5
end;

% Second approach
## ans = sigmoid(X * theta);
## for i = 1:m
##     if (ans(i, 1) >= 0.5)
##         p(i, 1) = 1;
##     else
##         p(i, 1) = 0;
##     end
## end

% =========================================================================

end

predict.m
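  Both versions above loop over the examples; since sigmoid and the comparison against 0.5 work element-wise, the same prediction can also be written in a single vectorized line (a sketch equivalent to the loop):

p = double(sigmoid(X * theta) >= 0.5);   % 1 where the predicted probability is at least 0.5, else 0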

Part 2: Regularized Logistic Regression

  Background: Suppose you manage a factory that produces microchips. Each chip goes through two tests and must pass both to be accepted. You are given a historical data set, ex2data2.txt, where the first column is the result of the first test, the second column is the result of the second test, and in the third column 1 means the chip is acceptable and 0 means it is not. Using this data, fit a model that will serve as the acceptance criterion for future chips.

  

  Plotting the data shows that the two classes are separated by a fairly complex curve. With only the two original features $x_1$ and $x_2$ we cannot fit such a curve, so we create additional features from the original two by mapping them into all polynomial terms up to the 6th power, which gives a 28-dimensional feature vector. With this many features the model can easily overfit: it fits the training set very well but generalizes poorly to new samples, which is not what we want.
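  The count of 28 can be checked directly: for each total degree $i$ from 0 to 6 there are $i+1$ monomials $x_1^{i-j}x_2^{j}$ (with $j=0,\dots,i$), so the mapped feature vector has $\sum_{i=0}^{6}(i+1)=1+2+\dots+7=28$ entries, including the constant term.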

Constructing more features:

function out = mapFeature(X1, X2)
% MAPFEATURE Feature mapping function to polynomial features
%
%   MAPFEATURE(X1, X2) maps the two input features
%   to quadratic features used in the regularization exercise.
%
%   Returns a new feature array with more features, comprising of
%   X1, X2, X1.^2, X2.^2, X1*X2, X1*X2.^2, etc..
%
%   Inputs X1, X2 must be the same size
%

degree = 6;
out = ones(size(X1(:,1)));
for i = 1:degree
    for j = 0:i
        out(:, end+1) = (X1.^(i-j)).*(X2.^j);
    end
end

end

mapFeature.m

To deal with this overfitting we use regularization.

  1. The regularized regression script, ex2_reg.m:

%% Machine Learning Online Class - Exercise 2: Logistic Regression
%
% Instructions
% ------------
%
% This file contains code that helps you get started on the second part
% of the exercise which covers regularization with logistic regression.
%
% You will need to complete the following functions in this exercise:
%
%     sigmoid.m
%     costFunction.m
%     predict.m
%     costFunctionReg.m
%
% For this exercise, you will not need to change any code in this file,
% or any other files other than those mentioned above.
%

%% Initialization
clear ; close all; clc

%% Load Data
% The first two columns contain the X values and the third column
% contains the label (y).

data = load('ex2data2.txt');
X = data(:, [1, 2]); y = data(:, 3);

plotData(X, y);

% Put some labels
hold on;

% Labels and Legend
xlabel('Microchip Test 1')
ylabel('Microchip Test 2')

% Specified in plot order
legend('y = 1', 'y = 0')
hold off;

%% =========== Part 1: Regularized Logistic Regression ============
% In this part, you are given a dataset with data points that are not
% linearly separable. However, you would still like to use logistic
% regression to classify the data points.
%
% To do so, you introduce more features to use -- in particular, you add
% polynomial features to our data matrix (similar to polynomial
% regression).
%

% Add Polynomial Features

% Note that mapFeature also adds a column of ones for us, so the intercept
% term is handled
X = mapFeature(X(:,1), X(:,2));   % maps the original 2 features to 28 (including the intercept term), so X is m x 28

% Initialize fitting parameters
initial_theta = zeros(size(X, 2), 1);

% Set regularization parameter lambda to 1
lambda = 1;

% Compute and display initial cost and gradient for regularized logistic
% regression
[cost, grad] = costFunctionReg(initial_theta, X, y, lambda);

fprintf('Cost at initial theta (zeros): %f\n', cost);
fprintf('Expected cost (approx): 0.693\n');
fprintf('Gradient at initial theta (zeros) - first five values only:\n');
fprintf(' %f \n', grad(1:5));
fprintf('Expected gradients (approx) - first five values only:\n');
fprintf(' 0.0085\n 0.0188\n 0.0001\n 0.0503\n 0.0115\n');

fprintf('\nProgram paused. Press enter to continue.\n');
pause;

% Compute and display cost and gradient
% with all-ones theta and lambda = 10
test_theta = ones(size(X, 2), 1);
[cost, grad] = costFunctionReg(test_theta, X, y, 10);

fprintf('\nCost at test theta (with lambda = 10): %f\n', cost);
fprintf('Expected cost (approx): 3.16\n');
fprintf('Gradient at test theta - first five values only:\n');
fprintf(' %f \n', grad(1:5));
fprintf('Expected gradients (approx) - first five values only:\n');
fprintf(' 0.3460\n 0.1614\n 0.1948\n 0.2269\n 0.0922\n');

fprintf('\nProgram paused. Press enter to continue.\n');
pause;

%% ============= Part 2: Regularization and Accuracies =============
% Optional Exercise:
% In this part, you will get to try different values of lambda and
% see how regularization affects the decision boundary
%
% Try the following values of lambda (0, 1, 10, 100).
%
% How does the decision boundary change when you vary lambda? How does
% the training set accuracy vary?
%

% Initialize fitting parameters
initial_theta = zeros(size(X, 2), 1);

% Set regularization parameter lambda to 1 (you should vary this)
lambda = 1;

% Set Options
options = optimset('GradObj', 'on', 'MaxIter', 400);

% Optimize
[theta, J, exit_flag] = ...
    fminunc(@(t)(costFunctionReg(t, X, y, lambda)), initial_theta, options);

% Plot Boundary
plotDecisionBoundary(theta, X, y);
hold on;
title(sprintf('lambda = %g', lambda))

% Labels and Legend
xlabel('Microchip Test 1')
ylabel('Microchip Test 2')

legend('y = 1', 'y = 0', 'Decision boundary')
hold off;

% Compute accuracy on our training set
p = predict(theta, X);

fprintf('Train Accuracy: %f\n', mean(double(p == y)) * 100);
fprintf('Expected accuracy (with lambda = 1): 83.1 (approx)\n');

ex2_reg.m

  2. The regularized logistic regression cost function (the bias term $\theta_0$ is not regularized):

  $J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}[y^{(i)}log(h_\theta(x^{(i)}))+(1-y^{(i)})log(1-h_{\theta}(x^{(i)}))]+\frac{\lambda }{2m}\sum_{j=1}^{n}\theta_j^{2}$

  

  3. Gradient descent:

  With the learning rate:

    $\theta_0:=\theta_0-\alpha \frac{1}{m }\sum_{i=1}^{m}[(h_\theta(x^{(i)})-y^{(i)})x^{(i)}_0]$   for $j=0$

    $\theta_j:=\theta_j-\alpha (\frac{1}{m }\sum_{i=1}^{m}[(h_\theta(x^{(i)})-y^{(i)})x^{(i)}_j]+\frac{\lambda }{m}\theta_j)$  for $j\geq 1$

  Without the learning rate (this is what we hand to fminunc):

    $\frac{\partial J(\theta)}{\partial \theta_0}=\frac{1}{m}\sum_{i=1}^{m}[(h_\theta(x^{(i)})-y^{(i)})x^{(i)}_0]$  for $j=0$

    $\frac{\partial J(\theta)}{\partial \theta_j}=(\frac{1}{m}\sum_{i=1}^{m}[(h_\theta(x^{(i)})-y^{(i)})x^{(i)}_j])+\frac{\lambda }{m}\theta_j $ for $j\geq 1$

  

function [J, grad] = costFunctionReg(theta, X, y, lambda)
%COSTFUNCTIONREG Compute cost and gradient for logistic regression with regularization
%   J = COSTFUNCTIONREG(theta, X, y, lambda) computes the cost of using
%   theta as the parameter for regularized logistic regression and the
%   gradient of the cost w.r.t. to the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;
grad = zeros(size(theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta

h = sigmoid(X * theta);
n = size(X, 2);
J = (-(y') * log(h) - (1 - y)' * log(1 - h)) / m ...
    + (lambda / (2 * m)) * sum(theta(2:n, :) .^ 2);   % the bias term theta(1) is not regularized
grad(1, :) = ((X(:, 1)') * (h - y)) / m;              % gradient for theta(1)
grad(2:n, :) = (X(:, 2:n)') * (h - y) ./ m + (theta(2:n, :)) .* (lambda / m);   % gradient for the remaining parameters

% An equivalent version that zeroes out theta(1) before adding the penalty:
## h = sigmoid(X * theta);
## theta(1, 1) = 0;
## J = sum(-y' * log(h) - (1 - y)' * log(1 - h)) / m + lambda / 2 / m * sum(power(theta, 2));
## grad = ((h - y)' * X) / m + lambda / m * theta';

% =============================================================

end

costFunctionReg.m
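  Note that at the initial $\theta=0$ the penalty term $\frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^{2}$ is zero, so the expected initial cost of 0.693 is the same as in the unregularized case; the regularization only starts to matter once $\theta$ moves away from zero. A quick check (a sketch, assuming X is the feature-mapped m x 28 matrix from ex2_reg.m):

[cost0, grad0] = costFunctionReg(zeros(size(X, 2), 1), X, y, 1);
fprintf('Cost at zeros with lambda = 1: %f\n', cost0);   % approximately 0.693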

  We can fit the data with different values of $\lambda$, visualize the resulting decision boundaries, and pick a good $\lambda$.
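  The script only fits a single value of $\lambda$; a minimal sketch of that comparison (assuming X and y are the feature-mapped data from ex2_reg.m, and using the values suggested in the script) might look like:

for lambda = [0 1 10 100]
    initial_theta = zeros(size(X, 2), 1);
    options = optimset('GradObj', 'on', 'MaxIter', 400);
    theta = fminunc(@(t)(costFunctionReg(t, X, y, lambda)), initial_theta, options);
    p = predict(theta, X);
    fprintf('lambda = %3d, train accuracy = %.1f\n', lambda, mean(double(p == y)) * 100);
end

  With $\lambda=0$ the boundary typically overfits (very high training accuracy, a very wiggly curve), while a very large $\lambda$ underfits and the training accuracy drops.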

  4. Prediction works much as in plain logistic regression, except that to predict a new example (say a first test score of 45 and a second of 80) you must first pass the two features through mapFeature to build the same 28-dimensional feature vector.
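  Concretely, the features must go through the same mapping as the training data before being multiplied by $\theta$ (a sketch using the example values from the text):

prob = sigmoid(mapFeature(45, 80) * theta);   % mapFeature returns the 1 x 28 mapped feature row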

