CNN反向传播算法公式
网络结构(6c-2s-12c-2s):
初始化:
\begin{align}\notag
W \sim U(- \frac{\sqrt{6}}{\sqrt{n_j+n_{j+1}}} , \frac{\sqrt{6}}{\sqrt{n_j+n_{j+1}}})
\end{align}
\begin{align}\notag
Var(W_i) = \frac{1}{n_i} ; Var(W_i) = \frac{1}{n_{i+1}} ; Var(W_i) = \frac{1}{n_i + n_{i+1}}
\end{align}
偏置 $ b $ 统一初始化为 $ 0 $ ,权重 $ W $ 设置为 $ random(-1,1)\sqrt{\frac{6}{fan_{in} + fan_{out}}} \sim U(- \frac{\sqrt{6}}{\sqrt{n_j+n_{j+1}}} , \frac{\sqrt{6}}{\sqrt{n_j+n_{j+1}}}) $ , $ n_j $ 表示神经网络的大小, $ fan_{in} = 输入通道数\times卷积核size $ , $ fan_{out} = 输出通道数\times卷积核size $ 。
for l = 1 : numel(net.layers) % layer
if strcmp(net.layers{l}.type, 's')
mapsize = mapsize / net.layers{l}.scale;
assert(all(floor(mapsize)==mapsize), ['Layer ' num2str(l) ' size must be integer. Actual: ' num2str(mapsize)]);
for j = 1 : inputmaps
net.layers{l}.b{j} = 0;
end
end
if strcmp(net.layers{l}.type, 'c')
mapsize = mapsize - net.layers{l}.kernelsize + 1;
fan_out = net.layers{l}.outputmaps * net.layers{l}.kernelsize ^ 2;
for j = 1 : net.layers{l}.outputmaps % output map
fan_in = inputmaps * net.layers{l}.kernelsize ^ 2;
for i = 1 : inputmaps % input map
net.layers{l}.k{i}{j} = (rand(net.layers{l}.kernelsize) - 0.5) * 2 * sqrt(6 / (fan_in + fan_out));
end
net.layers{l}.b{j} = 0;
end
inputmaps = net.layers{l}.outputmaps;
end
end
% 'onum' is the number of labels, that's why it is calculated using size(y, 1). If you have 20 labels so the output of the network will be 20 neurons.
% 'fvnum' is the number of output neurons at the last layer, the layer just before the output layer.
% 'ffb' is the biases of the output neurons.
% 'ffW' is the weights between the last layer and the output neurons. Note that the last layer is fully connected to the output layer, that's why the size of the weights is (onum * fvnum)
fvnum = prod(mapsize) * inputmaps;
onum = size(y, 1);
net.ffb = zeros(onum, 1);
net.ffW = (rand(onum, fvnum) - 0.5) * 2 * sqrt(6 / (onum + fvnum));
前向传播:
\begin{align}\notag
x_j^l = f(\sum_ {i\in M_j} x_i^{l-1} * k_{ij}^l + b_j^l)
\end{align}
% !!below can probably be handled by insane matrix operations
for j = 1 : net.layers{l}.outputmaps % for each output map
% create temp output map
z = zeros(size(net.layers{l - 1}.a{1}) - [net.layers{l}.kernelsize - 1 net.layers{l}.kernelsize - 1 0]);
for i = 1 : inputmaps % for each input map
% convolve with corresponding kernel and add to temp output map
z = z + convn(net.layers{l - 1}.a{i}, net.layers{l}.k{i}{j}, 'valid');
end
% add bias, pass through nonlinearity
net.layers{l}.a{j} = sigm(z + net.layers{l}.b{j});
end
% set number of input maps to this layers number of outputmaps
inputmaps = net.layers{l}.outputmaps;
前向传播:
\begin{align}\notag
x_j^l = f(\beta_j^l down(x_j^{l-1}) + b_j^l)
\end{align}
% downsample
for j = 1 : inputmaps
z = convn(net.layers{l - 1}.a{j}, ones(net.layers{l}.scale) / (net.layers{l}.scale ^ 2), 'valid'); % !! replace with variable
net.layers{l}.a{j} = z(1 : net.layers{l}.scale : end, 1 : net.layers{l}.scale : end, :);
end
前向传播:
% concatenate all end layer feature maps into vector
net.fv = [];
for j = 1 : numel(net.layers{n}.a)
sa = size(net.layers{n}.a{j});
net.fv = [net.fv; reshape(net.layers{n}.a{j}, sa(1) * sa(2), sa(3))];
end
% feedforward into output perceptrons
net.o = sigm(net.ffW * net.fv + repmat(net.ffb, 1, size(net.fv, 2)));
sigmoid函数求导:
\begin{align}\notag
f(x) = \frac{1}{1+e^{-x}} ; f^\prime(x) = \frac{e^{-x}}{(1+e^{-x})^2} = f(x) \cdot [1 - f(x)]
\end{align}
对网络的最后一层输出层,计算输出值和样本值得残差:
\begin{align}\notag
\delta^n = -(y-a^n)\cdot f^\prime(z^n)
\end{align}
% error
net.e = net.o - y;
%% backprop deltas
net.od = net.e .* (net.o .* (1 - net.o)); % output delta
对于隐层 $ l = n-1,n-2,n-3,...,2 $ ,计算各节点残差:
\begin{align}\notag
\delta^l = ({(W^l)}^T \delta^{l+1}) \cdot f^\prime(z^l)
\end{align}
% concatenate all end layer feature maps into vector
net.fv = [];
for j = 1 : numel(net.layers{n}.a)
sa = size(net.layers{n}.a{j});
net.fv = [net.fv; reshape(net.layers{n}.a{j}, sa(1) * sa(2), sa(3))];
end
net.fvd = (net.ffW' * net.od); % feature vector delta
if strcmp(net.layers{n}.type, 'c') % only conv layers has sigm function
net.fvd = net.fvd .* (net.fv .* (1 - net.fv));
end
反向传播:
\begin{align}\notag
\delta_j^l = f^\prime(u_j^l)\circ conv2(\delta_j^{l+1},rot180(k_j^{l+1}),'full')
\end{align}
for i = 1 : numel(net.layers{l}.a)
z = zeros(size(net.layers{l}.a{1}));
for j = 1 : numel(net.layers{l + 1}.a)
z = z + convn(net.layers{l + 1}.d{j}, rot180(net.layers{l + 1}.k{i}{j}), 'full');
end
net.layers{l}.d{i} = z;
end
反向传播:
\begin{align}\notag
\delta_j^l = \beta_j^{l+1}(f^\prime(u_j^l) \circ up(\delta_j^{l+1}))
\end{align}
for j = 1 : numel(net.layers{l}.a)
net.layers{l}.d{j} = net.layers{l}.a{j} .* (1 - net.layers{l}.a{j}) .* (expand(net.layers{l + 1}.d{j}, [net.layers{l + 1}.scale net.layers{l + 1}.scale 1]) / net.layers{l + 1}.scale ^ 2);
end
计算最终需要的偏导数值:
\begin{align}\notag
\nabla_{W^l}J(W,b;x,y) = \delta^{l+1}(a^l)^T
\end{align}
\begin{align}\notag
\nabla_{b^l}J(W,b;x,y) = \delta^{l+1}
\end{align}
\begin{align}\notag
\nabla_{W^l}J(W,b) = [\frac{1}{m}\sum_{i=1}^m\nabla_{W^l}J(W,b;x,y)]+\lambda W_{ij}^l
\end{align}
\begin{align}\notag
\nabla_{b^l}J(W,b) = \frac{1}{m}\sum_{i=1}^m\nabla_{b^l}J(W,b;x,y)
\end{align}
\begin{align}\notag
\frac{\partial E}{\partial k_{ij}^l} = rot180(conv2(x_i^{l-1},rot180(\delta_j^l),'valid'))
\end{align}
\begin{align}\notag
\frac{\partial E}{\partial b_j} = \sum_{u,v}(\delta_j^l)_{uv}
\end{align}
for l = 2 : n
if strcmp(net.layers{l}.type, 'c')
for j = 1 : numel(net.layers{l}.a)
for i = 1 : numel(net.layers{l - 1}.a)
net.layers{l}.dk{i}{j} = convn(flipall(net.layers{l - 1}.a{i}), net.layers{l}.d{j}, 'valid') / size(net.layers{l}.d{j}, 3);
end
net.layers{l}.db{j} = sum(net.layers{l}.d{j}(:)) / size(net.layers{l}.d{j}, 3);
end
end
end
net.dffW = net.od * (net.fv)' / size(net.od, 2);
net.dffb = mean(net.od, 2);
CNN反向传播算法公式的更多相关文章
- 卷积神经网络(CNN)反向传播算法
在卷积神经网络(CNN)前向传播算法中,我们对CNN的前向传播算法做了总结,基于CNN前向传播算法的基础,我们下面就对CNN的反向传播算法做一个总结.在阅读本文前,建议先研究DNN的反向传播算法:深度 ...
- CNN反向传播更新权值
背景 反向传播(Backpropagation)是训练神经网络最通用的方法之一,网上有许多文章尝试解释反向传播是如何工作的,但是很少有包括真实数字的例子,这篇博文尝试通过离散的数据解释它是怎样工作的. ...
- CNN反向传播算法过程
主模块 规格数据输入(加载,调格式,归一化) 定义网络结构 设置训练参数 调用初始化模块 调用训练模块 调用测试模块 画图 初始化模块 设置初始化参数(输入通道,输入尺寸) 遍历层(计算尺寸,输入输出 ...
- CNN中卷积层 池化层反向传播
参考:https://blog.csdn.net/kyang624823/article/details/78633897 卷积层 池化层反向传播: 1,CNN的前向传播 a)对于卷积层,卷积核与输入 ...
- CNN的反向传播
在一般的全联接神经网络中,我们通过反向传播算法计算参数的导数.BP 算法本质上可以认为是链式法则在矩阵求导上的运用.但 CNN 中的卷积操作则不再是全联接的形式,因此 CNN 的 BP 算法需要在原始 ...
- CNN压缩:为反向传播添加mask(caffe代码修改)
神经网络压缩的研究近三年十分热门,笔者查阅到相关的两篇博客,博主们非常奉献的提供了源代码,但是发发现在使用gpu训练添加mask的网络上,稍微有些不顺,特此再进行详细说明. 此文是在 基于Caffe的 ...
- 《神经网络的梯度推导与代码验证》之CNN前向和反向传播过程的代码验证
在<神经网络的梯度推导与代码验证>之CNN的前向传播和反向梯度推导 中,我们学习了CNN的前向传播和反向梯度求导,但知识仍停留在纸面.本篇章将基于深度学习框架tensorflow验证我们所 ...
- CNN卷积层基础:特征提取+卷积核+反向传播
本篇介绍卷积层的线性部分 一.与全连接层相比卷积层有什么优势? 卷积层可以节省参数,因为卷积运算利用了图像的局部相关性——分析出一小片区域的特点,加上Pooling层(汇集.汇聚),从附近的卷积结果中 ...
- 神经网络训练中的Tricks之高效BP(反向传播算法)
神经网络训练中的Tricks之高效BP(反向传播算法) 神经网络训练中的Tricks之高效BP(反向传播算法) zouxy09@qq.com http://blog.csdn.net/zouxy09 ...
随机推荐
- Operating systems Chapter 4
There are two processes to switch, when one run:io instruction, switch on other process. After ios, ...
- Python中的浅复制、深复制
参考 https://docs.python.org/3/library/copy.html?highlight=copy%20copy#copy.copy https://en.wikipedia. ...
- pillow 初级用法
# 转载至:https://www.cnblogs.com/apexchu/p/4231041.html Image类 Pillow中最重要的类就是Image,该类存在于同名的模块中.可以通过以下几种 ...
- sklearn笔记:决策树
概述 sklearn中决策树的类都在 tree 这个模块下.这个模块总共包含五个类: tree.DecisionTreeClassifier:分类树 tree.DecisionTreeRegresso ...
- webpack使用devtool :source map插件
链接 : https://www.cnblogs.com/chris-oil/p/8856020.html
- 如何查看本机的oracle数据库的IP地址 和 数据库名
1,如果是本机的oracle数据库,ip就为127.0.0.1,数据库名看tnsname.ora文件 LISTENER_ORCL = (ADDRESS = (PROTOCOL = TCP)(HOST ...
- 解决Vue 使用vue-router切换页面时 页面显示没有在顶部的问题
有时候我们需要页面滚动条滚动到某一固定的位置,一般使用Window scrollTo() 方法. 语法就是:scrollTo(xpos,ypos) xpos:必需.要在窗口文档显示区左上角显示的文档的 ...
- 重新梳理IT知识之java-01语法(一)
标识符的命名规范 包名:xxxyyyzzz 类名.接口名:XxxYyyZzz (大驼峰) 变量名.方法名:xxxYyyZzz 常量名:XXX_YYY_ZZZ //**************强制类型转 ...
- 【原】jenkins知识点_凭据(一)
一:凭据 1.目的: 与第三方网站或应用程序进行交互,如代码仓库.云存储系统和服务等 2.操作路径: Jenkins-凭据-系统-全局凭据 3.权限 Jenkins 中保存的凭证可以用于: 任何适用于 ...
- 疫情对国内5G发展的影响
导读 2020年春节期间,“新型冠状病毒”引发了自SARS之后又一次全国性疫情,而世卫组织也将之列为国际关注的突发公共卫生事件,随后多国发布了针对中国的旅行警告和入境限制,从当前情况来看,疫情对中国各 ...