caffe卷积层实现
下图是jiayangqing在知乎上的回答,其实过程就是把image转换成矩阵,然后进行矩阵运算

卷积的实现在conv_layer层,conv_layer层继承了base_conv_layer层,base_conv_layer层是卷积操作的基类,包含卷积和反卷积.conv_layer层的前向传播是通过forward_cpu_gemm函数实现,这个函数在vision_layer.hpp里进行了定义,在base_conv_layer.cpp里进行了实现.forward_cpu_gemm函数调用了caffe_cpu_gemm和conv_im2col_cpu,caffe_cpu_gemm的实现在util/math_function.cpp里,conv_im2col_cpu的实现在util/im2col.cpp里
conv_layer层的前向传播代码,先调forward_cpu_gemm函数进行卷积的乘法运算,然后根据是否需要bias项,调用forward_cpu_bias进行加法运算.
template <typename Dtype>
void ConvolutionLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) {
const Dtype* weight = this->blobs_[]->cpu_data();
for (int i = ; i < bottom.size(); ++i) {
const Dtype* bottom_data = bottom[i]->cpu_data();
Dtype* top_data = top[i]->mutable_cpu_data();
for (int n = ; n < this->num_; ++n) {
this->forward_cpu_gemm(bottom_data + bottom[i]->offset(n), weight,
top_data + top[i]->offset(n));
if (this->bias_term_) {
const Dtype* bias = this->blobs_[]->cpu_data();
this->forward_cpu_bias(top_data + top[i]->offset(n), bias);
}
}
}
}
num_在vision_layers.hpp中的BaseConvolutionLayer类中定义,表示batchsize(https://blog.csdn.net/sinat_22336563/article/details/69808612,这个博客给做了说明).也就是说.每一层卷积是先把一个batch所有的数据计算完才传给下一层,不是把batch中的一个在整个网络中计算一次,再把batch中的下一个传进整个网络进行计算.这里的bottom[i]->offset(n),相当于指向一个batch中下一个图片或者feature map的内存地址,即对batch中下一个进行计算
offset在blob.hpp中定义.bottom、top都是blob类,所以可以去调用blob这个类的属性或者方法,看具体实现的时候直接去看这个类怎么实现的就OK.
inline int offset(const int n, const int c = , const int h = ,
const int w = ) const {
CHECK_GE(n, );
CHECK_LE(n, num());
CHECK_GE(channels(), );
CHECK_LE(c, channels());
CHECK_GE(height(), );
CHECK_LE(h, height());
CHECK_GE(width(), );
CHECK_LE(w, width());
return ((n * channels() + c) * height() + h) * width() + w;
} inline int offset(const vector<int>& indices) const {
CHECK_LE(indices.size(), num_axes());
int offset = ;
for (int i = ; i < num_axes(); ++i) {
offset *= shape(i);
if (indices.size() > i) {
CHECK_GE(indices[i], );
CHECK_LT(indices[i], shape(i));
offset += indices[i];
}
}
return offset;
}

https://www.cnblogs.com/neopenx/p/5294682.html
forward_cpu_gemm、forward_cpu_bias函数的实现.is_1x1_用来判断是不是1x1的卷积操作,skip_im2col用来判断是不是需要把图片转换成矩阵
template <typename Dtype>
void BaseConvolutionLayer<Dtype>::forward_cpu_gemm(const Dtype* input,
const Dtype* weights, Dtype* output, bool skip_im2col) {
const Dtype* col_buff = input;
if (!is_1x1_) {
if (!skip_im2col) {
conv_im2col_cpu(input, col_buffer_.mutable_cpu_data());
}
col_buff = col_buffer_.cpu_data();
}
for (int g = ; g < group_; ++g) {
caffe_cpu_gemm<Dtype>(CblasNoTrans, CblasNoTrans, conv_out_channels_ /
group_, conv_out_spatial_dim_, kernel_dim_ / group_,
(Dtype)., weights + weight_offset_ * g, col_buff + col_offset_ * g,
(Dtype)., output + output_offset_ * g);
}
} template <typename Dtype>
void BaseConvolutionLayer<Dtype>::forward_cpu_bias(Dtype* output,
const Dtype* bias) {
caffe_cpu_gemm<Dtype>(CblasNoTrans, CblasNoTrans, num_output_,
height_out_ * width_out_, , (Dtype)., bias, bias_multiplier_.cpu_data(),
(Dtype)., output);
}
conv_im2col_cpu
inline void conv_im2col_cpu(const Dtype* data, Dtype* col_buff) {
im2col_cpu(data, , conv_in_channels_, conv_in_height_, conv_in_width_,
kernel_h_, kernel_w_, pad_h_, pad_w_, stride_h_, stride_w_,
, , col_buff);
}
template <typename Dtype>
void im2col_cpu(const Dtype* data_im, const int channels,
const int height, const int width, const int kernel_h, const int kernel_w,
const int pad_h, const int pad_w,
const int stride_h, const int stride_w,
const int dilation_h, const int dilation_w,
Dtype* data_col) {
const int output_h = (height + * pad_h -
(dilation_h * (kernel_h - ) + )) / stride_h + ;
const int output_w = (width + * pad_w -
(dilation_w * (kernel_w - ) + )) / stride_w + ;
const int channel_size = height * width;
for (int channel = channels; channel--; data_im += channel_size) {
for (int kernel_row = ; kernel_row < kernel_h; kernel_row++) {
for (int kernel_col = ; kernel_col < kernel_w; kernel_col++) {
int input_row = -pad_h + kernel_row * dilation_h;
for (int output_rows = output_h; output_rows; output_rows--) {
if (!is_a_ge_zero_and_a_lt_b(input_row, height)) {
for (int output_cols = output_w; output_cols; output_cols--) {
*(data_col++) = ;
}
} else {
int input_col = -pad_w + kernel_col * dilation_w;
for (int output_col = output_w; output_col; output_col--) {
if (is_a_ge_zero_and_a_lt_b(input_col, width)) {
*(data_col++) = data_im[input_row * width + input_col];
} else {
*(data_col++) = ;
}
input_col += stride_w;
}
}
input_row += stride_h;
}
}
}
}
}
caffe_cpu_gemm函数的实现.cblas_sgemm是cblas库的一个函数(# inclue<cblas.h>),
template<>
void caffe_cpu_gemm<float>(const CBLAS_TRANSPOSE TransA,
const CBLAS_TRANSPOSE TransB, const int M, const int N, const int K,
const float alpha, const float* A, const float* B, const float beta,
float* C) {
int lda = (TransA == CblasNoTrans) ? K : M;
int ldb = (TransB == CblasNoTrans) ? N : K;
cblas_sgemm(CblasRowMajor, TransA, TransB, M, N, K, alpha, A, lda, B,
ldb, beta, C, N);
} template<>
void caffe_cpu_gemm<double>(const CBLAS_TRANSPOSE TransA,
const CBLAS_TRANSPOSE TransB, const int M, const int N, const int K,
const double alpha, const double* A, const double* B, const double beta,
double* C) {
int lda = (TransA == CblasNoTrans) ? K : M;
int ldb = (TransB == CblasNoTrans) ? N : K;
cblas_dgemm(CblasRowMajor, TransA, TransB, M, N, K, alpha, A, lda, B,
ldb, beta, C, N);
} template<>
void caffe_cpu_gemm<Half>(const CBLAS_TRANSPOSE TransA,
const CBLAS_TRANSPOSE TransB, const int M, const int N, const int K,
const Half alpha, const Half* A, const Half* B, const Half beta,
Half* C) {
DeviceMemPool& pool = Caffe::Get().cpu_mem_pool();
float* fA = (float*)pool.Allocate(M*K*sizeof(float));
float* fB = (float*)pool.Allocate(K*N*sizeof(float));
float* fC = (float*)pool.Allocate(M*N*sizeof(float));
halves2floats_cpu(A, fA, M*K);
halves2floats_cpu(B, fB, K*N);
halves2floats_cpu(C, fC, M*N);
caffe_cpu_gemm<float>(TransA, TransB, M, N, K, float(alpha), fA, fB, float(beta), fC);
floats2halves_cpu(fC, C, M*N);
pool.Free((char*)fA);
pool.Free((char*)fB);
pool.Free((char*)fC);
}
caffe卷积层实现的更多相关文章
- caffe 卷积层的运算
贾清扬寻找快速算法之路:https://github.com/Yangqing/caffe/wiki/Convolution-in-Caffe:-a-memo 卷积运算图文并茂:http://www. ...
- caffe卷积层代码阅读笔记
卷积的实现思想: 通过im2col将image转为一个matrix,将卷积操作转为矩阵乘法运算 通过调用GEMM完毕运算操作 以下两个图是我在知乎中发现的,"盗"用一下,确实非常好 ...
- caffe之(一)卷积层
在caffe中,网络的结构由prototxt文件中给出,由一些列的Layer(层)组成,常用的层如:数据加载层.卷积操作层.pooling层.非线性变换层.内积运算层.归一化层.损失计算层等:本篇主要 ...
- caffe源码 卷积层
通俗易懂理解卷积 图示理解神经网络的卷积 input: 3 * 5 * 5 (c * h * w) pading: 1 步长: 2 卷积核: 2 * 3 * 3 * 3 ( n * c * k * k ...
- TensorFlow与caffe中卷积层feature map大小计算
刚刚接触Tensorflow,由于是做图像处理,因此接触比较多的还是卷及神经网络,其中会涉及到在经过卷积层或者pooling层之后,图像Feature map的大小计算,之前一直以为是与caffe相同 ...
- caffe Python API 之卷积层(Convolution)
1.Convolution层: 就是卷积层,是卷积神经网络(CNN)的核心层. 层类型:Convolution lr_mult: 学习率的系数,最终的学习率是这个数乘以solver.prototxt配 ...
- 【caffe】卷积层代码解析
1.Forward_cpu conv_layer.cpp template <typename Dtype> void ConvolutionLayer<Dtype>::For ...
- caffe中卷积层和pooling层计算下一层的特征map的大小
pool层,其中ceil是向上取整函数 卷积层:
- caffe中全卷积层和全连接层训练参数如何确定
今天来仔细讲一下卷基层和全连接层训练参数个数如何确定的问题.我们以Mnist为例,首先贴出网络配置文件: name: "LeNet" layer { name: "mni ...
随机推荐
- TOJ 1258 Very Simple Counting
Description Let f(n) be the number of factors of integer n. Your task is to count the number of i(1 ...
- Go语言下载网络图片或文件
最近闲来无事, 于是就简单学习了下Go语言的基本的用法.由于实践才是最快的学习方法,所以这里就以下载网络图片或文件入手来学习Go语言 文件下载到本地,通常的思路就是先获得网络文件的 输入流 以及本地文 ...
- The Definitive C++ Book Guide and List--reference
http://stackoverflow.com/questions/388242/the-definitive-c-book-guide-and-list Reference Style - All ...
- Java学习第二十二天
1:登录注册IO版本案例(掌握) 要求,对着写一遍. cn.itcast.pojo User cn.itcast.dao UserDao cn.itcast.dao.impl UserDaoImpl( ...
- jQuery中的事件——《锋利的JQuery》
虽然利用原生的JavaScript事件能完成一些交互,但jQuery增加并扩展了基本的事件处理机制.jQuery不仅提供了更加优雅的事件处理语法,而且极大地增强了事件处理能力. 1.加载DOM 在Ja ...
- Linux Tomcat 80端口 Port 80 required by Tomcat v8.5 Server at localhost is already in use.
Port 80 required by Tomcat v8.5 Server at localhost is already in use. The server may already be run ...
- (六-1)Firefox插件安装
1.安装火狐插件 ①旧版本firefox 火狐浏览器右上角-->附件-->获取更多附件-->搜索-->Firebug 安装 Firebug 扩展:https://addons. ...
- 【VMware】VMware的安装和更改虚拟机默认存储路径
1.VMware Workstation 14中文破解版安装 网盘下载链接:https://pan.baidu.com/s/1pbDXpgpNQTdTRzzKNWfE1A linux镜像(CentOS ...
- python学习(六)--正则的一些例子
import re #正则表达式#compile函数,--将正则表达式转变为内部函数,提高执行效率strr = "python123456"pattern = "Pyth ...
- main方法击破
什么是main方法? 是类中的一段代码,可以让程序独立运行. public class HelloWord{ public static void main(String[] args) { for ...