1. Feedforward and cost function

The (unregularized) cost function for the neural network is:

$$J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}\left[-y_k^{(i)}\log\left(\left(h_\theta(x^{(i)})\right)_k\right) - \left(1-y_k^{(i)}\right)\log\left(1-\left(h_\theta(x^{(i)})\right)_k\right)\right]$$

2. Regularized cost function

For the 3-layer network used in this exercise (400 input units, 25 hidden units, 10 output classes), the regularized cost adds a penalty on the non-bias weights:

$$J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}\left[-y_k^{(i)}\log\left(\left(h_\theta(x^{(i)})\right)_k\right) - \left(1-y_k^{(i)}\right)\log\left(1-\left(h_\theta(x^{(i)})\right)_k\right)\right] + \frac{\lambda}{2m}\left[\sum_{j=1}^{25}\sum_{k=1}^{400}\left(\Theta_{j,k}^{(1)}\right)^2 + \sum_{j=1}^{10}\sum_{k=1}^{25}\left(\Theta_{j,k}^{(2)}\right)^2\right]$$

Note that the terms for the bias units (the first column of each Theta matrix) are not regularized.

3. Sigmoid gradient

The gradient for the sigmoid function can be computed as:

$$g'(z) = \frac{d}{dz}g(z) = g(z)\left(1 - g(z)\right)$$

where:

$$\mathrm{sigmoid}(z) = g(z) = \frac{1}{1 + e^{-z}}$$
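The corresponding sigmoidGradient.m is a one-liner; a minimal version (assuming the exercise's sigmoid.m helper is on the path) looks like this:

sigmoidGradient.m

function g = sigmoidGradient(z)
%SIGMOIDGRADIENT returns the gradient of the sigmoid function
%evaluated at z. z can be a matrix or a vector; the gradient is
%computed element-wise.
g = sigmoid(z) .* (1 - sigmoid(z));
end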

4. Random initialization

To break symmetry while training the network, each weight is initialized to a random value in $[-\epsilon_{init}, \epsilon_{init}]$, with $\epsilon_{init} = 0.12$:

randInitializeWeights.m

function W = randInitializeWeights(L_in, L_out)
%RANDINITIALIZEWEIGHTS Randomly initialize the weights of a layer with L_in
%incoming connections and L_out outgoing connections
%   W = RANDINITIALIZEWEIGHTS(L_in, L_out) randomly initializes the weights
%   of a layer with L_in incoming connections and L_out outgoing
%   connections.
%
%   Note that W should be set to a matrix of size(L_out, 1 + L_in) as
%   the first column of W handles the "bias" terms
%

% You need to return the following variables correctly
W = zeros(L_out, 1 + L_in);

% ====================== YOUR CODE HERE ======================
% Instructions: Initialize W randomly so that we break the symmetry while
%               training the neural network.
%
% Note: The first column of W corresponds to the parameters for the bias units
%

epsilon_init = 0.12;
W = rand(L_out, 1 + L_in) * 2 * epsilon_init - epsilon_init;

% =========================================================================

end
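In ex4.m these weights are generated for both layers and then unrolled into a single parameter vector; a sketch of that call sequence (variable names follow the exercise scripts):

initial_Theta1 = randInitializeWeights(input_layer_size, hidden_layer_size);
initial_Theta2 = randInitializeWeights(hidden_layer_size, num_labels);

% Unroll parameters into a single vector
initial_nn_params = [initial_Theta1(:) ; initial_Theta2(:)];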

5. Backpropagation (implemented with a for-loop for t = 1:m, placing steps 1-4 below inside the loop), with the t-th iteration performing the calculation on the t-th training example $(x^{(t)}, y^{(t)})$. Step 5 divides the accumulated gradients by m to obtain the gradients for the neural network cost function.

(1) Set the input layer's values ($a^{(1)}$) to the t-th training example $x^{(t)}$. Perform a feedforward pass, computing the activations ($z^{(2)}, a^{(2)}, z^{(3)}, a^{(3)}$) for layers 2 and 3. Note that a +1 term must be added so that the activation vectors $a^{(1)}$ and $a^{(2)}$ include the bias unit.

(2) For each output unit k in layer 3 (the output layer), set:

$$\delta_k^{(3)} = a_k^{(3)} - y_k$$

where $y_k \in \{0, 1\}$ indicates whether the current training example belongs to class k ($y_k = 1$) or to a different class ($y_k = 0$).

(3) For the hidden layer l = 2, set:

$$\delta^{(2)} = \left(\Theta^{(2)}\right)^{T}\delta^{(3)} \circ g'\!\left(z^{(2)}\right)$$

where $\circ$ denotes element-wise multiplication (.* in Octave).

(4) Accumulate the gradient from this example using the following formula. Note that you should skip or remove $\delta_0^{(2)}$.

$$\Delta^{(l)} := \Delta^{(l)} + \delta^{(l+1)}\left(a^{(l)}\right)^{T}$$

(5) Obtain the (unregularized) gradient for the neural network cost function by dividing the accumulated gradients by m:

$$\frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta) = D_{ij}^{(l)} = \frac{1}{m}\Delta_{ij}^{(l)}$$

nnCostFunction.m

function [J grad] = nnCostFunction(nn_params, ...
                                   input_layer_size, ...
                                   hidden_layer_size, ...
                                   num_labels, ...
                                   X, y, lambda)
%NNCOSTFUNCTION Implements the neural network cost function for a two layer
%neural network which performs classification
%   [J grad] = NNCOSTFUNCTION(nn_params, hidden_layer_size, num_labels, ...
%   X, y, lambda) computes the cost and gradient of the neural network. The
%   parameters for the neural network are "unrolled" into the vector
%   nn_params and need to be converted back into the weight matrices.
%
%   The returned parameter grad should be an "unrolled" vector of the
%   partial derivatives of the neural network.
%

% Reshape nn_params back into the parameters Theta1 and Theta2, the weight matrices
% for our 2 layer neural network
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
                 hidden_layer_size, (input_layer_size + 1));
Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
                 num_labels, (hidden_layer_size + 1));

% Setup some useful variables
m = size(X, 1);

% You need to return the following variables correctly
J = 0;
Theta1_grad = zeros(size(Theta1));
Theta2_grad = zeros(size(Theta2));

% ====================== YOUR CODE HERE ======================
% Instructions: You should complete the code by working through the
%               following parts.
%
% Part 1: Feedforward the neural network and return the cost in the
%         variable J. After implementing Part 1, you can verify that your
%         cost function computation is correct by verifying the cost
%         computed in ex4.m
%
% Part 2: Implement the backpropagation algorithm to compute the gradients
%         Theta1_grad and Theta2_grad. You should return the partial derivatives of
%         the cost function with respect to Theta1 and Theta2 in Theta1_grad and
%         Theta2_grad, respectively. After implementing Part 2, you can check
%         that your implementation is correct by running checkNNGradients
%
%         Note: The vector y passed into the function is a vector of labels
%               containing values from 1..K. You need to map this vector into a
%               binary vector of 1's and 0's to be used with the neural network
%               cost function.
%
%         Hint: We recommend implementing backpropagation using a for-loop
%               over the training examples if you are implementing it for the
%               first time.
%
% Part 3: Implement regularization with the cost function and gradients.
%
%         Hint: You can implement this around the code for
%               backpropagation. That is, you can compute the gradients for
%               the regularization separately and then add them to Theta1_grad
%               and Theta2_grad from Part 2.
%

% Part 1
% Theta1 has size 25 * 401
% Theta2 has size 10 * 26
% y has size 5000 * 1
K = num_labels;
Y = eye(K)(y,:);              % [5000 10]
a1 = [ones(m,1), X];          % [5000 401]
a2 = sigmoid(a1*Theta1');     % [5000 25]
a2 = [ones(m,1), a2];         % [5000 26]
h = sigmoid(a2*Theta2');      % [5000 10]
costPositive = -Y .* log(h);
costNegative = (1-Y) .* log(1-h);
cost = costPositive - costNegative;
J = (1/m) * sum(cost(:));

% Regularized
Theta1Filtered = Theta1(:, 2:end);   % [25 400]
Theta2Filtered = Theta2(:, 2:end);   % [10 25]
reg = (lambda/(2*m)) * (sumsq(Theta1Filtered(:)) + sumsq(Theta2Filtered(:)));
J = J + reg;

% Part 2
Delta1 = 0;
Delta2 = 0;
for t = 1:m,
    % step 1
    a1 = [1 X(t,:)];            % [1 401]
    z2 = a1*Theta1';            % [1 25]
    a2 = [1 sigmoid(z2)];       % [1 26]
    z3 = a2*Theta2';            % [1 10]
    a3 = sigmoid(z3);           % [1 10]
    % step 2
    yt = Y(t,:);                % [1 10]
    d3 = a3 - yt;               % [1 10]
    % step 3
    %      [1 10]  [10 25]                           [1 25]
    d2 = (d3*Theta2Filtered) .* sigmoidGradient(z2); % [1 25]
    % step 4
    Delta1 = Delta1 + (d2'*a1); % [25 401]
    Delta2 = Delta2 + (d3'*a2); % [10 26]
end;

% step 5
Theta1_grad = (1/m) * Delta1;
Theta2_grad = (1/m) * Delta2;

% Part 3
Theta1_grad(:,2:end) = Theta1_grad(:,2:end) + ((lambda/m) * Theta1Filtered);
Theta2_grad(:,2:end) = Theta2_grad(:,2:end) + ((lambda/m) * Theta2Filtered);

% -------------------------------------------------------------

% =========================================================================

% Unroll gradients
grad = [Theta1_grad(:) ; Theta2_grad(:)];

end
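Once nnCostFunction returns both the cost and the unrolled gradient, ex4.m trains the network with fmincg (provided with the exercise code); roughly:

options = optimset('MaxIter', 50);
lambda = 1;

% Short-hand for the cost function to be minimized
costFunction = @(p) nnCostFunction(p, input_layer_size, hidden_layer_size, ...
                                   num_labels, X, y, lambda);

[nn_params, cost] = fmincg(costFunction, initial_nn_params, options);

% Reshape the learned parameters back into Theta1 and Theta2
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
                 hidden_layer_size, (input_layer_size + 1));
Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
                 num_labels, (hidden_layer_size + 1));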

6. Gradient checking

Let

$$\theta^{(i+)} = \theta + \begin{bmatrix} 0 \\ \vdots \\ \epsilon \\ \vdots \\ 0 \end{bmatrix}$$

and

$$\theta^{(i-)} = \theta - \begin{bmatrix} 0 \\ \vdots \\ \epsilon \\ \vdots \\ 0 \end{bmatrix}$$

where the $\epsilon$ sits in the i-th position. The backpropagation gradient $f_i(\theta)$ can then be verified numerically by checking, for each i, that:

$$f_i(\theta) \approx \frac{J(\theta^{(i+)}) - J(\theta^{(i-)})}{2\epsilon}$$

computeNumericalGradient.m

function numgrad = computeNumericalGradient(J, theta)
%COMPUTENUMERICALGRADIENT Computes the gradient using "finite differences"
%and gives us a numerical estimate of the gradient.
%   numgrad = COMPUTENUMERICALGRADIENT(J, theta) computes the numerical
%   gradient of the function J around theta. Calling y = J(theta) should
%   return the function value at theta.

% Notes: The following code implements numerical gradient checking, and
%        returns the numerical gradient. It sets numgrad(i) to (a numerical
%        approximation of) the partial derivative of J with respect to the
%        i-th input argument, evaluated at theta. (i.e., numgrad(i) should
%        be the (approximately) the partial derivative of J with respect
%        to theta(i).)
%

numgrad = zeros(size(theta));
perturb = zeros(size(theta));
e = 1e-4;
for p = 1:numel(theta)
    % Set perturbation vector
    perturb(p) = e;
    loss1 = J(theta - perturb);
    loss2 = J(theta + perturb);
    % Compute Numerical Gradient
    numgrad(p) = (loss2 - loss1) / (2*e);
    perturb(p) = 0;
end

end
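checkNNGradients.m (provided with the exercise) builds a small debug network and compares this numerical estimate against the backpropagation gradient; the core of that comparison is a relative-difference test along these lines:

% numgrad from computeNumericalGradient, grad from nnCostFunction
diff = norm(numgrad - grad) / norm(numgrad + grad);
fprintf('Relative difference: %g\n', diff);  % should be on the order of 1e-9 or smaller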

7. Regularized Neural Networks

for j = 0:

$$\frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta) = D_{ij}^{(l)} = \frac{1}{m}\Delta_{ij}^{(l)}$$

for j ≥ 1:

$$\frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta) = D_{ij}^{(l)} = \frac{1}{m}\Delta_{ij}^{(l)} + \frac{\lambda}{m}\Theta_{ij}^{(l)}$$
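With the regularization terms added to Theta1_grad and Theta2_grad, the gradient check can be repeated with a non-zero lambda, e.g. as in ex4.m:

% Check gradients of the regularized cost function
lambda = 3;
checkNNGradients(lambda);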

Other people's code:

https://github.com/jcgillespie/Coursera-Machine-Learning/tree/master/ex4
