caffe卷积层实现

下图是jiayangqing在知乎上的回答,其实过程就是把image转换成矩阵,然后进行矩阵运算

卷积的实现在conv_layer层,conv_layer层继承了base_conv_layer层,base_conv_layer层是卷积操作的基类,包含卷积和反卷积.conv_layer层的前向传播是通过forward_cpu_gemm函数实现,这个函数在vision_layer.hpp里进行了定义,在base_conv_layer.cpp里进行了实现.forward_cpu_gemm函数调用了caffe_cpu_gemm和conv_im2col_cpu,caffe_cpu_gemm的实现在util/math_function.cpp里,conv_im2col_cpu的实现在util/im2col.cpp里

conv_layer层的前向传播代码,先调forward_cpu_gemm函数进行卷积的乘法运算,然后根据是否需要bias项,调用forward_cpu_bias进行加法运算.

template <typename Dtype>

void ConvolutionLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,

      const vector<Blob<Dtype>*>& top) {

  const Dtype* weight = this->blobs_[]->cpu_data();

  for (int i = ; i < bottom.size(); ++i) {

    const Dtype* bottom_data = bottom[i]->cpu_data();

    Dtype* top_data = top[i]->mutable_cpu_data();

    for (int n = ; n < this->num_; ++n) {

      this->forward_cpu_gemm(bottom_data + bottom[i]->offset(n), weight,

          top_data + top[i]->offset(n));

      if (this->bias_term_) {

        const Dtype* bias = this->blobs_[]->cpu_data();

        this->forward_cpu_bias(top_data + top[i]->offset(n), bias);

      }

    }

  }

}

num_在vision_layers.hpp中的BaseConvolutionLayer类中定义,表示batchsize(https://blog.csdn.net/sinat_22336563/article/details/69808612,这个博客给做了说明).也就是说.每一层卷积是先把一个batch所有的数据计算完才传给下一层,不是把batch中的一个在整个网络中计算一次,再把batch中的下一个传进整个网络进行计算.这里的bottom[i]->offset(n),相当于指向一个batch中下一个图片或者feature map的内存地址,即对batch中下一个进行计算

offset在blob.hpp中定义.bottom、top都是blob类,所以可以去调用blob这个类的属性或者方法,看具体实现的时候直接去看这个类怎么实现的就OK.

  inline int offset(const int n, const int c = , const int h = ,

      const int w = ) const {

    CHECK_GE(n, );

    CHECK_LE(n, num());

    CHECK_GE(channels(), );

    CHECK_LE(c, channels());

    CHECK_GE(height(), );

    CHECK_LE(h, height());

    CHECK_GE(width(), );

    CHECK_LE(w, width());

    return ((n * channels() + c) * height() + h) * width() + w;

  }

  inline int offset(const vector<int>& indices) const {

    CHECK_LE(indices.size(), num_axes());

    int offset = ;

    for (int i = ; i < num_axes(); ++i) {

      offset *= shape(i);

      if (indices.size() > i) {

        CHECK_GE(indices[i], );

        CHECK_LT(indices[i], shape(i));

        offset += indices[i];

      }

    }

    return offset;

  }

https://www.cnblogs.com/neopenx/p/5294682.html

forward_cpu_gemm、forward_cpu_bias函数的实现.is_1x1_用来判断是不是1x1的卷积操作,skip_im2col用来判断是不是需要把图片转换成矩阵

template <typename Dtype>

void BaseConvolutionLayer<Dtype>::forward_cpu_gemm(const Dtype* input,

    const Dtype* weights, Dtype* output, bool skip_im2col) {

  const Dtype* col_buff = input;

  if (!is_1x1_) {

    if (!skip_im2col) {

      conv_im2col_cpu(input, col_buffer_.mutable_cpu_data());

    }

    col_buff = col_buffer_.cpu_data();

  }

  for (int g = ; g < group_; ++g) {

    caffe_cpu_gemm<Dtype>(CblasNoTrans, CblasNoTrans, conv_out_channels_ /

        group_, conv_out_spatial_dim_, kernel_dim_ / group_,

        (Dtype)., weights + weight_offset_ * g, col_buff + col_offset_ * g,

        (Dtype)., output + output_offset_ * g);

  }

}

template <typename Dtype>

void BaseConvolutionLayer<Dtype>::forward_cpu_bias(Dtype* output,

    const Dtype* bias) {

  caffe_cpu_gemm<Dtype>(CblasNoTrans, CblasNoTrans, num_output_,

      height_out_ * width_out_, , (Dtype)., bias, bias_multiplier_.cpu_data(),

      (Dtype)., output);

}

conv_im2col_cpu

inline void conv_im2col_cpu(const Dtype* data, Dtype* col_buff) {

    im2col_cpu(data, , conv_in_channels_, conv_in_height_, conv_in_width_,

      kernel_h_, kernel_w_, pad_h_, pad_w_, stride_h_, stride_w_,

      , , col_buff);

  }

template <typename Dtype>

void im2col_cpu(const Dtype* data_im, const int channels,

    const int height, const int width, const int kernel_h, const int kernel_w,

    const int pad_h, const int pad_w,

    const int stride_h, const int stride_w,

    const int dilation_h, const int dilation_w,

    Dtype* data_col) {

  const int output_h = (height +  * pad_h -

    (dilation_h * (kernel_h - ) + )) / stride_h + ;

  const int output_w = (width +  * pad_w -

    (dilation_w * (kernel_w - ) + )) / stride_w + ;

  const int channel_size = height * width;

  for (int channel = channels; channel--; data_im += channel_size) {

    for (int kernel_row = ; kernel_row < kernel_h; kernel_row++) {

      for (int kernel_col = ; kernel_col < kernel_w; kernel_col++) {

        int input_row = -pad_h + kernel_row * dilation_h;

        for (int output_rows = output_h; output_rows; output_rows--) {

          if (!is_a_ge_zero_and_a_lt_b(input_row, height)) {

            for (int output_cols = output_w; output_cols; output_cols--) {

              *(data_col++) = ;

            }

          } else {

            int input_col = -pad_w + kernel_col * dilation_w;

            for (int output_col = output_w; output_col; output_col--) {

              if (is_a_ge_zero_and_a_lt_b(input_col, width)) {

                *(data_col++) = data_im[input_row * width + input_col];

              } else {

                *(data_col++) = ;

              }

              input_col += stride_w;

            }

          }

          input_row += stride_h;

        }

      }

    }

  }

}

caffe_cpu_gemm函数的实现.cblas_sgemm是cblas库的一个函数(# inclue<cblas.h>),

template<>

void caffe_cpu_gemm<float>(const CBLAS_TRANSPOSE TransA,

    const CBLAS_TRANSPOSE TransB, const int M, const int N, const int K,

    const float alpha, const float* A, const float* B, const float beta,

    float* C) {

  int lda = (TransA == CblasNoTrans) ? K : M;

  int ldb = (TransB == CblasNoTrans) ? N : K;

  cblas_sgemm(CblasRowMajor, TransA, TransB, M, N, K, alpha, A, lda, B,

      ldb, beta, C, N);

}

template<>

void caffe_cpu_gemm<double>(const CBLAS_TRANSPOSE TransA,

    const CBLAS_TRANSPOSE TransB, const int M, const int N, const int K,

    const double alpha, const double* A, const double* B, const double beta,

    double* C) {

  int lda = (TransA == CblasNoTrans) ? K : M;

  int ldb = (TransB == CblasNoTrans) ? N : K;

  cblas_dgemm(CblasRowMajor, TransA, TransB, M, N, K, alpha, A, lda, B,

      ldb, beta, C, N);

}

template<>

void caffe_cpu_gemm<Half>(const CBLAS_TRANSPOSE TransA,

    const CBLAS_TRANSPOSE TransB, const int M, const int N, const int K,

    const Half alpha, const Half* A, const Half* B, const Half beta,

    Half* C) {

  DeviceMemPool& pool = Caffe::Get().cpu_mem_pool();

  float* fA = (float*)pool.Allocate(M*K*sizeof(float));

  float* fB = (float*)pool.Allocate(K*N*sizeof(float));

  float* fC = (float*)pool.Allocate(M*N*sizeof(float));

  halves2floats_cpu(A, fA, M*K);

  halves2floats_cpu(B, fB, K*N);

  halves2floats_cpu(C, fC, M*N);

  caffe_cpu_gemm<float>(TransA, TransB, M, N, K, float(alpha), fA, fB, float(beta), fC);

  floats2halves_cpu(fC, C, M*N);

  pool.Free((char*)fA);

  pool.Free((char*)fB);

  pool.Free((char*)fC);

}

caffe卷积层实现的更多相关文章

caffe 卷积层的运算
贾清扬寻找快速算法之路:https://github.com/Yangqing/caffe/wiki/Convolution-in-Caffe:-a-memo 卷积运算图文并茂:http://www. ...
caffe卷积层代码阅读笔记
卷积的实现思想: 通过im2col将image转为一个matrix,将卷积操作转为矩阵乘法运算通过调用GEMM完毕运算操作以下两个图是我在知乎中发现的,"盗"用一下,确实非常好 ...
caffe之（一）卷积层
在caffe中,网络的结构由prototxt文件中给出,由一些列的Layer(层)组成,常用的层如:数据加载层.卷积操作层.pooling层.非线性变换层.内积运算层.归一化层.损失计算层等:本篇主要 ...
caffe源码卷积层
通俗易懂理解卷积图示理解神经网络的卷积 input: 3 * 5 * 5 (c * h * w) pading: 1 步长: 2 卷积核: 2 * 3 * 3 * 3 ( n * c * k * k ...
TensorFlow与caffe中卷积层feature map大小计算
刚刚接触Tensorflow,由于是做图像处理,因此接触比较多的还是卷及神经网络,其中会涉及到在经过卷积层或者pooling层之后,图像Feature map的大小计算,之前一直以为是与caffe相同 ...
caffe Python API 之卷积层（Convolution）
1.Convolution层: 就是卷积层,是卷积神经网络(CNN)的核心层. 层类型:Convolution lr_mult: 学习率的系数,最终的学习率是这个数乘以solver.prototxt配 ...
【caffe】卷积层代码解析
1.Forward_cpu conv_layer.cpp template <typename Dtype> void ConvolutionLayer<Dtype>::For ...
caffe中卷积层和pooling层计算下一层的特征map的大小
pool层,其中ceil是向上取整函数卷积层:
caffe中全卷积层和全连接层训练参数如何确定
今天来仔细讲一下卷基层和全连接层训练参数个数如何确定的问题.我们以Mnist为例,首先贴出网络配置文件: name: "LeNet" layer { name: "mni ...

随机推荐

oracle 备份恢复篇(五)---rman 剩下控制文件和spfile
一,环境准备 ❤ 拥有全量备份文件
nginx 代理服务指令详解
nginx 正向代理与反向代理说明图超级形象说明. 正向代理指令: 1, resolver 这个用于DNS服务器的ip . DNS服务器的主要工作是进行域名解析,将域名映射为对应IP地址 resol ...
linux运维配置讲解--sshd-config
文件配置: 1, /etc/ssh/sshd_config ssh配置文件 2, /etc/shadow 密码文件 3, /etc/sudoers 授权用户管理文件 4, /etc/issue 系统信 ...
Angular4+NodeJs+MySQL 入门-04 接口调用类
上一篇文章说一下,后台接口的创建,这篇说一下如果调用接口. 创建一个目录helpers 此目录下有三个文件分别是 ApiClient.ts.clientMiddleware.ts.Core.ts,前面 ...
Docker的基本构架
不多说,直接上干货! Docker的基本构架 Docker基于Client-Server架构,Docker daemon是服务端,Docker client是客户端. Docker的基本架构,如下图所 ...
java多线程之线程组与线程池
看这篇文章:http://blog.csdn.net/zen99t/article/details/50909099
nyoj 104——最大和——————【子矩阵最大和】
最大和时间限制:1000 ms | 内存限制:65535 KB 难度:5 描述给定一个由整数组成二维矩阵(r*c),现在需要找出它的一个子矩阵,使得这个子矩阵内的所有元素之和最大,并把这个 ...
SQL 脚本整理笔记
1.视图存储过程触发器批量加密(With Encryption),单个解密在运行过程中自己找不到启用DAC 的地方,链接的时候需要在服务器名称前面添加ADMIN:,如本机是ADMIN:WP-P ...
mvc中在Action里调用另一个Action
今天做东西时发现一个新东西.即在一个Action调用另一Action.前提是同一个控制器.(没在一个控制里的没试过) 调用方法: public ActionResult Test1(){ //to ...
【转】 ASP.NET使用ICallbackEventHandler无刷新验证用户名是否可用
功能说明:当用户在用户名输入框输入字符并焦点离开此输入框时,自动到数据库用户表中验证此用户名是否已被注册,如果已被注册,显示[不可用],反之,显示[可用],期间页面不刷新,读者也可以考虑将提示文字换成 ...

caffe卷积层实现

caffe卷积层实现的更多相关文章

随机推荐

热门专题