CheeseZH: Stanford University: Machine Learning Ex4: Training Neural Network (Backpropagation Algorithm)
1. Feedforward and cost function
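For reference, the unregularized cost that Part 1 of nnCostFunction.m computes can be written as follows. The notation is an assumption that follows the usual course convention: $m$ is the number of training examples, $K$ the number of output units, $(h_\Theta(x^{(i)}))_k$ the k-th output activation for example $i$, and $y_k^{(i)}$ the k-th entry of its one-hot label.

```latex
J(\Theta) = \frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K}
  \left[ -y_k^{(i)} \log\left( (h_\Theta(x^{(i)}))_k \right)
         - \left( 1 - y_k^{(i)} \right) \log\left( 1 - (h_\Theta(x^{(i)}))_k \right) \right]
```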
2. Regularized cost function
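A sketch of the regularized cost, assuming the exercise's layer sizes ($s_1 = 400$ inputs, $s_2 = 25$ hidden units, $s_3 = 10$ outputs) and writing $J(\Theta)$ for the unregularized cost of section 1. The inner sums start at $k = 1$ because the bias column ($k = 0$) is not regularized, which is what the `Theta1(:,2:end)` slicing in nnCostFunction.m is for.

```latex
J_{\mathrm{reg}}(\Theta) = J(\Theta) + \frac{\lambda}{2m}
  \left[ \sum_{j=1}^{s_2} \sum_{k=1}^{s_1} \left( \Theta^{(1)}_{j,k} \right)^2
       + \sum_{j=1}^{s_3} \sum_{k=1}^{s_2} \left( \Theta^{(2)}_{j,k} \right)^2 \right]
```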
3. Sigmoid gradient
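The gradient of the sigmoid function can be computed as:
$$ g'(z) = \frac{d}{dz} g(z) = g(z) \left( 1 - g(z) \right) $$
where:
$$ g(z) = \mathrm{sigmoid}(z) = \frac{1}{1 + e^{-z}} $$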
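The post does not list sigmoidGradient.m, so here is a minimal sketch of how it might look in Octave. The sigmoid is defined inline as an anonymous function only to keep the snippet self-contained (the exercise ships its own sigmoid.m); the variable names g0 and gv are just for illustration.

```octave
% Sketch only: an inline sigmoid stands in for the exercise's sigmoid.m.
sigmoid = @(z) 1 ./ (1 + exp(-z));

% g'(z) = g(z) .* (1 - g(z)), evaluated element-wise so that z can be
% a scalar, a vector, or a matrix.
sigmoidGradient = @(z) sigmoid(z) .* (1 - sigmoid(z));

g0 = sigmoidGradient(0);           % the derivative peaks at z = 0
gv = sigmoidGradient([-10 0 10]);  % and is close to 0 far from the origin
```

At z = 0 the sigmoid equals 0.5, so the gradient there is 0.5 * 0.5 = 0.25, which is a handy value for checking an implementation.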
4. Random initialization
randInitializeWeights.m
function W = randInitializeWeights(L_in, L_out)
%RANDINITIALIZEWEIGHTS Randomly initialize the weights of a layer with L_in
%incoming connections and L_out outgoing connections
%   W = RANDINITIALIZEWEIGHTS(L_in, L_out) randomly initializes the weights
%   of a layer with L_in incoming connections and L_out outgoing
%   connections.
%
%   Note that W should be set to a matrix of size(L_out, 1 + L_in) as
%   the first column of W handles the "bias" terms
%

% You need to return the following variables correctly
W = zeros(L_out, 1 + L_in);

% ====================== YOUR CODE HERE ======================
% Instructions: Initialize W randomly so that we break the symmetry while
%               training the neural network.
%
% Note: The first column of W corresponds to the parameters for the bias units
%

% rand returns values in (0, 1), so W is uniform in (-epsilon_init, epsilon_init)
epsilon_init = 0.12;
W = rand(L_out, 1 + L_in) * 2 * epsilon_init - epsilon_init;

% =========================================================================

end
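The value 0.12 is not arbitrary. The exercise handout suggests choosing epsilon_init from the layer sizes as sqrt(6) / sqrt(L_in + L_out). A small sketch, assuming the first layer of this exercise (400 inputs, 25 hidden units):

```octave
% Layer sizes of the first layer in ex4 (assumed here for illustration).
L_in = 400;
L_out = 25;

% Heuristic from the exercise handout: scale the range by the layer sizes.
epsilon_init = sqrt(6) / sqrt(L_in + L_out);   % roughly 0.12

% Same initialization as in randInitializeWeights, with the derived range.
W = rand(L_out, 1 + L_in) * 2 * epsilon_init - epsilon_init;
```

For other layer sizes the same formula gives a different range, so hard-coding 0.12 is only appropriate for a network of about this size.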
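5. Backpropagation
Implement backpropagation with a for-loop over the training examples (for t = 1:m) and place steps 1-4 below inside the loop, so that the t-th iteration performs the calculation on the t-th training example $(x^{(t)}, y^{(t)})$. Step 5 then divides the accumulated gradients by m to obtain the gradients for the neural network cost function.
(1) Set the input layer's values $a^{(1)}$ to the t-th training example $x^{(t)}$. Perform a feedforward pass, computing the activations $(z^{(2)}, a^{(2)}, z^{(3)}, a^{(3)})$ for layers 2 and 3.
(2) For each output unit k in layer 3 (the output layer), set:
$$ \delta_k^{(3)} = a_k^{(3)} - y_k $$
where $y_k$ is 1 if the current example belongs to class k and 0 otherwise.
(3) For the hidden layer l = 2, set:
$$ \delta^{(2)} = \left( \Theta^{(2)} \right)^T \delta^{(3)} \, .\!* \; g'(z^{(2)}) $$
(4) Accumulate the gradient from this example using the following formula. Note that you should skip or remove $\delta_0^{(2)}$.
$$ \Delta^{(l)} = \Delta^{(l)} + \delta^{(l+1)} \left( a^{(l)} \right)^T $$
(5) Obtain the (unregularized) gradient for the neural network cost function by multiplying the accumulated gradients by 1/m:
$$ \frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta) = D_{ij}^{(l)} = \frac{1}{m} \Delta_{ij}^{(l)} $$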
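Once the for-loop version works, the same five steps can also be written without the loop. The following is only a sketch on made-up toy data: the sizes, the random X and the labels y are assumptions for illustration, and sigmoid and sigmoidGradient are defined inline instead of using the exercise's files.

```octave
sigmoid = @(z) 1 ./ (1 + exp(-z));
sigmoidGradient = @(z) sigmoid(z) .* (1 - sigmoid(z));

% Toy problem (assumed): 5 examples, 4 features, 3 hidden units, 2 classes.
m = 5; n = 4; hid = 3; K = 2;
X = rand(m, n);
y = [1; 2; 1; 2; 2];
Theta1 = rand(hid, n + 1) - 0.5;
Theta2 = rand(K, hid + 1) - 0.5;

% Step 1 for all examples at once: one example per row.
I = eye(K);
Y = I(y, :);                        % m x K one-hot labels
A1 = [ones(m, 1) X];                % m x (n+1), bias unit prepended
Z2 = A1 * Theta1';                  % m x hid
A2 = [ones(m, 1) sigmoid(Z2)];      % m x (hid+1)
A3 = sigmoid(A2 * Theta2');         % m x K

% Steps 2 and 3: output and hidden layer errors (bias column skipped).
D3 = A3 - Y;
D2 = (D3 * Theta2(:, 2:end)) .* sigmoidGradient(Z2);

% Steps 4 and 5: the matrix products sum over the examples, then scale by 1/m.
Theta1_grad = (D2' * A1) / m;
Theta2_grad = (D3' * A2) / m;
```

The product D3' * A2 adds up the per-example outer products that the loop accumulates one at a time, which is why the two versions agree.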
nnCostFunction.m
function [J grad] = nnCostFunction(nn_params, ...
                                   input_layer_size, ...
                                   hidden_layer_size, ...
                                   num_labels, ...
                                   X, y, lambda)
%NNCOSTFUNCTION Implements the neural network cost function for a two layer
%neural network which performs classification
%   [J grad] = NNCOSTFUNCTION(nn_params, input_layer_size, hidden_layer_size, ...
%   num_labels, X, y, lambda) computes the cost and gradient of the neural
%   network. The parameters for the neural network are "unrolled" into the
%   vector nn_params and need to be converted back into the weight matrices.
%
%   The returned parameter grad should be an "unrolled" vector of the
%   partial derivatives of the neural network.
%

% Reshape nn_params back into the parameters Theta1 and Theta2, the weight matrices
% for our 2 layer neural network
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
                 hidden_layer_size, (input_layer_size + 1));

Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
                 num_labels, (hidden_layer_size + 1));

% Setup some useful variables
m = size(X, 1);

% You need to return the following variables correctly
J = 0;
Theta1_grad = zeros(size(Theta1));
Theta2_grad = zeros(size(Theta2));

% ====================== YOUR CODE HERE ======================
% Instructions: You should complete the code by working through the
%               following parts.
%
% Part 1: Feedforward the neural network and return the cost in the
%         variable J. After implementing Part 1, you can verify that your
%         cost function computation is correct by verifying the cost
%         computed in ex4.m
%
% Part 2: Implement the backpropagation algorithm to compute the gradients
%         Theta1_grad and Theta2_grad. You should return the partial derivatives of
%         the cost function with respect to Theta1 and Theta2 in Theta1_grad and
%         Theta2_grad, respectively. After implementing Part 2, you can check
%         that your implementation is correct by running checkNNGradients
%
%         Note: The vector y passed into the function is a vector of labels
%               containing values from 1..K. You need to map this vector into a
%               binary vector of 1's and 0's to be used with the neural network
%               cost function.
%
%         Hint: We recommend implementing backpropagation using a for-loop
%               over the training examples if you are implementing it for the
%               first time.
%
% Part 3: Implement regularization with the cost function and gradients.
%
%         Hint: You can implement this around the code for
%               backpropagation. That is, you can compute the gradients for
%               the regularization separately and then add them to Theta1_grad
%               and Theta2_grad from Part 2.
%

%Part 1
%Theta1 has size 25*401
%Theta2 has size 10*26
%y has size 5000*1
K = num_labels;
Y = eye(K)(y,:);            %[5000 10] one-hot labels (Octave-only indexing)
a1 = [ones(m,1),X];         %[5000 401]
a2 = sigmoid(a1*Theta1');   %[5000 25]
a2 = [ones(m,1),a2];        %[5000 26]
h = sigmoid(a2*Theta2');    %[5000 10]

costPositive = -Y.*log(h);
costNegative = (1-Y).*log(1-h);
cost = costPositive - costNegative;
J = (1/m)*sum(cost(:));
%Regularized (the bias column is excluded)
Theta1Filtered = Theta1(:,2:end); %[25 400]
Theta2Filtered = Theta2(:,2:end); %[10 25]
reg = (lambda/(2*m))*(sumsq(Theta1Filtered(:))+sumsq(Theta2Filtered(:)));
J = J + reg;

%Part 2
Delta1 = 0;
Delta2 = 0;
for t=1:m,
    %step 1
    a1 = [1 X(t,:)];        %[1 401]
    z2 = a1*Theta1';        %[1 25]
    a2 = [1 sigmoid(z2)];   %[1 26]
    z3 = a2*Theta2';        %[1 10]
    a3 = sigmoid(z3);       %[1 10]
    %step 2
    yt = Y(t,:);            %[1 10]
    d3 = a3-yt;             %[1 10]
    %step 3
    %     [1 10] [10 25]            [1 25]
    d2 = (d3*Theta2Filtered).*sigmoidGradient(z2); %[1 25]
    %step 4
    Delta1 = Delta1 + (d2'*a1); %[25 401]
    Delta2 = Delta2 + (d3'*a2); %[10 26]
end;

%step 5
Theta1_grad = (1/m)*Delta1;
Theta2_grad = (1/m)*Delta2;

%Part 3
Theta1_grad(:,2:end) = Theta1_grad(:,2:end) + ((lambda/m)*Theta1Filtered);
Theta2_grad(:,2:end) = Theta2_grad(:,2:end) + ((lambda/m)*Theta2Filtered);

% -------------------------------------------------------------

% =========================================================================

% Unroll gradients
grad = [Theta1_grad(:) ; Theta2_grad(:)];

end
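6. Gradient checking
Let
$$ \theta^{(i+)} = \theta + \epsilon e_i $$
and
$$ \theta^{(i-)} = \theta - \epsilon e_i $$
where $e_i$ is the vector with 1 in position i and 0 everywhere else. If $f_i(\theta)$ is the computed partial derivative of $J(\theta)$ with respect to $\theta_i$, check for each i that:
$$ f_i(\theta) \approx \frac{J(\theta^{(i+)}) - J(\theta^{(i-)})}{2\epsilon} $$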
computeNumericalGradient.m
function numgrad = computeNumericalGradient(J, theta)
%COMPUTENUMERICALGRADIENT Computes the gradient using "finite differences"
%and gives us a numerical estimate of the gradient.
%   numgrad = COMPUTENUMERICALGRADIENT(J, theta) computes the numerical
%   gradient of the function J around theta. Calling y = J(theta) should
%   return the function value at theta.

% Notes: The following code implements numerical gradient checking, and
%        returns the numerical gradient. It sets numgrad(i) to (a numerical
%        approximation of) the partial derivative of J with respect to the
%        i-th input argument, evaluated at theta. (i.e., numgrad(i) should
%        be (approximately) the partial derivative of J with respect
%        to theta(i).)
%

numgrad = zeros(size(theta));
perturb = zeros(size(theta));
e = 1e-4;
for p = 1:numel(theta)
    % Set perturbation vector
    perturb(p) = e;
    loss1 = J(theta - perturb);
    loss2 = J(theta + perturb);
    % Compute Numerical Gradient
    numgrad(p) = (loss2 - loss1) / (2*e);
    perturb(p) = 0;
end

end
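To see the finite-difference check in action without the whole network, it helps to try it on a cost whose gradient is known exactly. This is a sketch on a toy function chosen for illustration, not part of the exercise: J(theta) = sum(theta.^2) with gradient 2*theta. The loop is inlined so the snippet runs on its own, and rel_diff uses the same relative-difference measure that checkNNGradients prints.

```octave
% Toy cost with a known analytic gradient (assumed for illustration).
J = @(theta) sum(theta .^ 2);
theta = [1; -2; 0.5];
analytic = 2 * theta;

% Central differences, one coordinate at a time.
e = 1e-4;
numgrad = zeros(size(theta));
for p = 1:numel(theta)
    perturb = zeros(size(theta));
    perturb(p) = e;
    numgrad(p) = (J(theta + perturb) - J(theta - perturb)) / (2 * e);
end

% Relative difference between the numerical and the analytic gradient.
rel_diff = norm(numgrad - analytic) / norm(numgrad + analytic);
```

If the analytic gradient is right, rel_diff should be tiny. For the neural network, the exercise expects a value below about 1e-9.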
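7. Regularized Neural Networks
After computing $\Delta_{ij}^{(l)}$ with backpropagation, add the regularization term to the gradient. The bias column is not regularized.
for j = 0:
$$ \frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta) = D_{ij}^{(l)} = \frac{1}{m} \Delta_{ij}^{(l)} $$
for j >= 1:
$$ \frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta) = D_{ij}^{(l)} = \frac{1}{m} \Delta_{ij}^{(l)} + \frac{\lambda}{m} \Theta_{ij}^{(l)} $$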
Someone else's code:
https://github.com/jcgillespie/Coursera-Machine-Learning/tree/master/ex4