Reference: https://www.cs.swarthmore.edu/~meeden/cs81/s10/BackPropDeriv.pdf

  I spent nearly one hour to deduce the vector form of the back propagation. Just in case that I may forget, but need to utilize them, I will write down all the formula here to make a backup.

Structure:

  Standard BP Network with $\displaystyle \lambda$ hidden layers, one input layer and one output layer.

  Activation function: sigmoid.

Notations:

$\displaystyle W^{i+1,i}$, denotes the weight matrix connecting from $i$th layer to $i+1$th layer.

$\displaystyle N^i$, denotes the net input of the $i$th layer.

$\displaystyle A^i$, denotes the activation input of the $i$th layer.

$\displaystyle \delta ^i$, denotes the error of the $i$th layer.

$\displaystyle \epsilon$, denotes the learning rate.

*, stands for element by element multiplication.

(omit), stands for matrix multiplication.

  Specifically,

$\displaystyle X$, denotes the input layer, while equals $\displaystyle A^0$.

$\displaystyle A^{\lambda + 1}$, denotes the output layer.

$\displaystyle Y$, denotes the expected output.

Propagations:

  Forward:

$\displaystyle N^i = W^{i,i-1}A^{i-1}$.

$\displaystyle A^i = \frac{1}{1+e^{-N^i}}$.

  Backward:

$\displaystyle \Delta W^{i+1,i} = \epsilon \delta^{i+1}(A^{i})^{T}$.

$\displaystyle \delta ^i = ((\delta^{i+1})^{T}W^{i+1,i})^{T}*A^{i}*(1-A^{i})$.

$\displaystyle \delta ^{\lambda + 1} = (Y - A^{\lambda + 1})*A^{\lambda + 1}*(1-A^{\lambda + 1})$.

Deduction:

  I am not capable of taking the partial derivative of vector or matrix over vector or matrix, so I derive these formulas by observing the formula for each element in the matrix and extend it to the vector form.

$\displaystyle \Delta W^{\lambda+1,\lambda}_{i,j} = \epsilon (Y_i - A^{\lambda+1}_i)A^{\lambda+1}_i(1-A^{\lambda +1}_i)A^{\lambda}_j$.

  Let's assume $\displaystyle \delta ^{\lambda+1}_{i} := (Y_i - A^{\lambda+1}_i)A^{\lambda+1}_i(1-A^{\lambda +1}_i)$.

$\displaystyle \Delta W^{\lambda,\lambda-1}_{i,j}=\epsilon (\delta^{\lambda+1})^{T}W^{\lambda+1,\lambda}_{col(i)}A_i^{\lambda}(1-A_i^{\lambda})A_j^{\lambda-1}$.

  Let's assume $\displaystyle \delta ^{\lambda}_{i} := (\delta^{\lambda+1})^{T}W^{\lambda+1,\lambda}_{col(i)}A_i^{\lambda}(1-A_i^{\lambda})$.

  The left are reserved for the readers to complete.

[Machine Learning][BP]The Vectorized Back Propagation Algorithm的更多相关文章

  1. CheeseZH: Stanford University: Machine Learning Ex4:Training Neural Network(Backpropagation Algorithm)

    1. Feedforward and cost function; 2.Regularized cost function: 3.Sigmoid gradient The gradient for t ...

  2. Bayesian machine learning

    from: http://www.metacademy.org/roadmaps/rgrosse/bayesian_machine_learning Created by: Roger Grosse( ...

  3. 机器学习算法之旅A Tour of Machine Learning Algorithms

    In this post we take a tour of the most popular machine learning algorithms. It is useful to tour th ...

  4. [GPU] Machine Learning on C++

    一.MPI为何物? 初步了解:MPI集群环境搭建 二.重新认识Spark 链接:https://www.zhihu.com/question/48743915/answer/115738668 马铁大 ...

  5. A Gentle Introduction to the Gradient Boosting Algorithm for Machine Learning

    A Gentle Introduction to the Gradient Boosting Algorithm for Machine Learning by Jason Brownlee on S ...

  6. Machine Learning—Mixtures of Gaussians and the EM algorithm

    印象笔记同步分享:Machine Learning-Mixtures of Gaussians and the EM algorithm

  7. AUTOML --- Machine Learning for Automated Algorithm Design.

    自动算法的机器学习: Machine Learning for Automated Algorithm Design. http://www.ml4aad.org/ AutoML——降低机器学习门槛的 ...

  8. (转)Introduction to Gradient Descent Algorithm (along with variants) in Machine Learning

    Introduction Optimization is always the ultimate goal whether you are dealing with a real life probl ...

  9. machine learning model(algorithm model) .vs. statistical model

    https://www.analyticsvidhya.com/blog/2015/07/difference-machine-learning-statistical-modeling/ http: ...

随机推荐

  1. python3 使用selenium +webdriver打开chrome失败,报错:FileNotFoundError: [Errno 2] No such file or directory: 'chromedriver': 'chromedriver'

    提示chrome driver没有放置在正确的路径下 解决方法: 1.chromedriver与chrome各版本及下载地址 驱动的下载地址如下: http://chromedriver.storag ...

  2. 分段控制器UISegmentedControl的使用、同一个控制器中实现多个View的切换、addChildViewController等方法的使用

    本文先讲解简单的分段控制器UISegmentedControl的使用,然后具体讲解它最常使用的场景:同一个控制器中实现多个View的切换. 文章构思: 1.先直接讲解一张UI效果图的四种实现方式. 2 ...

  3. smoj2828子数组有主元素

    题面 一个数组B,如果有其中一个元素出现的次数大于length(B) div 2,那么该元素就是数组B的主元素,显然数组B最多只有1个主元素,因为数组B有主元素,所以被称为"优美的" ...

  4. 攻防世界web进阶区(2)--记一次sql注入

    题目地址:http://111.198.29.45:56094 这是一道sql注入题. 试试1' order by 3#,发现页面显示正常,将3换为4时,页面报错,则说明含有3个字段. 接下来判断输出 ...

  5. vue移动端transition兼容

    vue移动端transition兼容 .face-recognition .wrapper(:style="{height: viewHeight+'px'}") .face-re ...

  6. Centos7 网卡Device does not seem to be present解决办法

    1.ifconfig -a 查看当前所有网卡 2.修改网络配置文件 3.在原来文件的基础上,修改网卡名称 DEVICE=ens32 NAME=ens32 并且把UUID以及mac地址删掉 mv ifc ...

  7. IOS TableView 用法

    1.在视图上创建TableView( 拖控件),为ViewController创建UITableView属性(链接至TableView)和NSArray属性(存储数据) ViewController. ...

  8. 记-ItextPDF+freemaker 生成PDF文件---导致服务宕机

    摘要:已经上线的项目,出现服务挂掉的情况. 介绍:该服务是专门做打印的,业务需求是生成PDF文件进行页面预览,主要是使用ItextPDF+freemaker技术生成一系列PDF文件,其中生成流程有:解 ...

  9. swoole之任务和定时器

    一.代码 <?php use Swoole\Server; /** * 面向对象的形式 + task + timer */ class WebSocket { public $server; p ...

  10. No 'Access-Control-Allow-Origin'跨域问题- (mysql-thinkphp) (6)

    因为ajax请求一个服务的时候,服务器端,比如thinkphp端,或者java框架,它会检测,你请求时候的域名,就是http请求的时候,request header不是会把客户端的Request UR ...