The goal of backpropagation is to compute the partial derivatives ∂C/∂w and ∂C/∂b of the cost function C with respect to any weight w or bias b in the network.

We use the quadratic cost function

    C = (1/(2n)) ∑_x ‖y(x) − a^L(x)‖²

  where n is the total number of training examples, the sum is over individual training examples x, y(x) is the corresponding desired output, and a^L(x) is the vector of activations output from the network when x is input.
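  As a quick illustration, here is a minimal NumPy sketch of the per-example quadratic cost (the function and array names are my own, not from the chapter):

    import numpy as np

    def quadratic_cost(y, a_L):
        # C_x = (1/2) * ||y - a_L||^2 for a single training example x
        return 0.5 * np.linalg.norm(y - a_L) ** 2

    y = np.array([0.0, 1.0])
    a_L = np.array([0.2, 0.7])
    print(quadratic_cost(y, a_L))   # 0.5 * (0.04 + 0.09) = 0.065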

   

Two assumptions:

  1: The first assumption we need is that the cost function can be written as an average

        C = (1/n) ∑_x C_x

     over cost functions C_x for individual training examples x. This is the case for the quadratic cost function, where the cost of a single training example is C_x = (1/2) ‖y − a^L‖².

  The reason we need this assumption is that what backpropagation actually lets us do is compute the partial derivatives ∂C_x/∂w and ∂C_x/∂b for a single training example. We then recover ∂C/∂w and ∂C/∂b by averaging over training examples. With this assumption in mind, we'll suppose the training example x has been fixed, and drop the x subscript, writing the cost C_x as C. We'll eventually put the x back in, but for now it's a notational nuisance that is better left implicit.

  2: The second assumption is that the cost can be written as a function of the outputs from the neural network, C = C(a^L). The quadratic cost satisfies this requirement, since for a single training example

        C = (1/2) ‖y − a^L‖² = (1/2) ∑_j (y_j − a^L_j)²

     and y is a fixed parameter, not something the network learns to change.

the Hadamard product

   s⊙t denotes the elementwise product of two vectors s and t, with components

      (s⊙t)_j = s_j t_j
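  For example, in NumPy the Hadamard product of two arrays is simply the * operator (a small sketch of my own):

    import numpy as np

    s = np.array([1, 2])
    t = np.array([3, 4])
    print(s * t)   # elementwise (Hadamard) product: [3 8]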

  

The four fundamental equations behind backpropagation

  

BP1

   Define the error δ^l_j of the jth neuron in the lth layer as

      δ^l_j ≡ ∂C/∂z^l_j

   where z^l_j is the weighted input to that neuron.

    (In the book's picture, a demon sits at the jth neuron in layer l and adds a small change Δz^l_j to the neuron's weighted input, changing the overall cost by roughly (∂C/∂z^l_j) Δz^l_j.) You might wonder why the demon is changing the weighted input z^l_j. Surely it'd be more natural to imagine the demon changing the output activation a^l_j, with the result that we'd be using ∂C/∂a^l_j as our measure of error. In fact, if you do this, things work out quite similarly to the discussion below, but it turns out to make the presentation of backpropagation a little more algebraically complicated. So we'll stick with δ^l_j = ∂C/∂z^l_j as our measure of error.

  An equation for the error in the output layer, δ^L: the components of δ^L are given by

      δ^L_j = (∂C/∂a^L_j) σ′(z^L_j)        (BP1)

  Here ∂C/∂a^L_j measures how fast the cost changes as a function of the jth output activation, and σ′(z^L_j) measures how fast the activation function σ is changing at z^L_j. It's easy to rewrite the equation in a matrix-based form, as

      δ^L = ∇_a C ⊙ σ′(z^L)        (BP1a)

  For the quadratic cost we have ∇_a C = (a^L − y), so BP1 becomes δ^L = (a^L − y) ⊙ σ′(z^L).
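  A small NumPy sketch of BP1 for this quadratic-cost case (the array names z_L, a_L, y are my own illustration):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def sigmoid_prime(z):
        # sigma'(z) = sigma(z) * (1 - sigma(z))
        return sigmoid(z) * (1.0 - sigmoid(z))

    z_L = np.array([0.5, -1.0])    # weighted inputs to the output layer
    a_L = sigmoid(z_L)             # output activations a^L
    y = np.array([1.0, 0.0])       # desired output
    delta_L = (a_L - y) * sigmoid_prime(z_L)   # BP1 with grad_a C = a^L - y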

  

  

  

BP2

  An equation for the error δ^l in terms of the error in the next layer, δ^(l+1):

      δ^l = ((w^(l+1))^T δ^(l+1)) ⊙ σ′(z^l)        (BP2)

  where (w^(l+1))^T is the transpose of the weight matrix w^(l+1) for the (l+1)th layer. Combining BP2 with BP1 lets us compute the error δ^l for any layer in the network: use BP1 to compute δ^L, then apply BP2 repeatedly to get δ^(L−1), δ^(L−2), and so on, all the way back through the network.
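  Continuing the NumPy sketch above (same import and sigmoid_prime), BP2 is a transpose-multiply followed by a Hadamard product; the shapes below are made up for illustration:

    w_next = np.array([[0.1, -0.2, 0.3],
                       [0.4, 0.5, -0.6]])   # w^(l+1), shape (2, 3)
    delta_next = np.array([0.05, -0.02])    # delta^(l+1), shape (2,)
    z_l = np.array([0.1, 0.2, -0.3])        # weighted inputs in layer l
    delta_l = np.dot(w_next.T, delta_next) * sigmoid_prime(z_l)   # BP2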

BP3

  An equation for the rate of change of the cost with respect to any bias in the network:

      ∂C/∂b^l_j = δ^l_j        (BP3)

  That is, the error δ^l_j is exactly equal to the rate of change ∂C/∂b^l_j.

BP4

  An equation for the rate of change of the cost with respect to any weight in the network:

      ∂C/∂w^l_jk = a^(l−1)_k δ^l_j        (BP4)

  That is, the partial derivative ∂C/∂w^l_jk is the activation of the neuron input to the weight times the error of the neuron output from the weight; in shorthand, ∂C/∂w = a_in δ_out. One consequence is that weights output from low-activation neurons learn slowly.
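  Continuing the same sketch with column vectors, BP3 and BP4 turn a layer's error directly into gradient components (a_prev stands for a^(l−1); the names and shapes are my own):

    a_prev = np.array([[0.3], [0.9]])               # a^(l-1), shape (2, 1)
    delta_l = np.array([[0.05], [-0.02], [0.01]])   # delta^l, shape (3, 1)
    grad_b = delta_l                    # BP3: dC/db^l_j = delta^l_j
    grad_w = np.dot(delta_l, a_prev.T)  # BP4: shape (3, 2), one entry per w^l_jk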

The backpropagation algorithm

  1. Input x: Set the corresponding activation a^1 for the input layer.

  2. Feedforward: For each l = 2, 3, …, L compute z^l = w^l a^(l−1) + b^l and a^l = σ(z^l).

  3. Output error δ^L: Compute the vector δ^L = ∇_a C ⊙ σ′(z^L).

  4. Backpropagate the error: For each l = L−1, L−2, …, 2 compute δ^l = ((w^(l+1))^T δ^(l+1)) ⊙ σ′(z^l).

  5. Output: The gradient of the cost function is given by ∂C/∂w^l_jk = a^(l−1)_k δ^l_j and ∂C/∂b^l_j = δ^l_j.

    Of course, to implement stochastic gradient descent in practice you also need an outer loop generating mini-batches of training examples, and an outer loop stepping through multiple epochs of training. I've omitted those for simplicity.
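  Putting the five steps together, here is a minimal self-contained NumPy sketch for a single training example, in the spirit of the chapter's network.py (the function and variable names are my own; activations are column vectors):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def sigmoid_prime(z):
        return sigmoid(z) * (1.0 - sigmoid(z))

    def backprop(x, y, weights, biases):
        # 1. Input: set the activation a^1 for the input layer.
        activation = x
        activations = [x]   # activations, layer by layer
        zs = []             # weighted-input vectors z^l, layer by layer
        # 2. Feedforward: z^l = w^l a^(l-1) + b^l, a^l = sigma(z^l).
        for w, b in zip(weights, biases):
            z = np.dot(w, activation) + b
            zs.append(z)
            activation = sigmoid(z)
            activations.append(activation)
        grad_b = [np.zeros(b.shape) for b in biases]
        grad_w = [np.zeros(w.shape) for w in weights]
        # 3. Output error, BP1 (quadratic cost: grad_a C = a^L - y).
        delta = (activations[-1] - y) * sigmoid_prime(zs[-1])
        grad_b[-1] = delta                              # BP3
        grad_w[-1] = np.dot(delta, activations[-2].T)   # BP4
        # 4. Backpropagate the error with BP2, layer by layer.
        for l in range(2, len(weights) + 1):
            delta = np.dot(weights[-l + 1].T, delta) * sigmoid_prime(zs[-l])
            grad_b[-l] = delta                                 # BP3
            grad_w[-l] = np.dot(delta, activations[-l - 1].T)  # BP4
        # 5. Output: the per-example gradients dC_x/db and dC_x/dw.
        return grad_b, grad_w

  Averaging the returned (grad_b, grad_w) over the examples in a mini-batch, and stepping the weights and biases against that average, supplies the outer loops mentioned above.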

Reference: http://neuralnetworksanddeeplearning.com/chap2.html

------------------------------------------------------------------------------------------------

Reference: Machine Learning by Andrew Ng
