Caffe's layer_factory
While benchmarking the per-layer time of a neural network, I ran into a very strange problem: switching between Caffe's own GPU implementation and the cuDNN implementation produced a huge performance difference for convolution, but essentially none for the pooling layer. After finding time to inspect the code, I discovered that the layer_factory pattern was the cause. This post proceeds along the following lines:
1. The factory pattern
2. layer_factory in detail
3. A pitfall in layer_factory
4. Impact of the problem
1. The factory pattern
The factory pattern is one of the classic design patterns. It addresses the situation where, at coding time, you cannot foresee which class will need to be instantiated, and the rest of the system should not depend on the details of how product objects are created, composed, and represented. Its drawback is that it is only a good fit for projects that require little extension, since adding a new product means touching the factory.
The factory pattern involves three roles:
Factory role: produces concrete products according to the program's logic
Abstract product role: the parent of the concrete products, usually realized as an interface in Java or an abstract class in C++
Concrete product role: the product instances themselves
2. layer_factory in detail
As is well known, Caffe 1.0 currently has three broad families of operators: CPU versions, Caffe's own CUDA versions, and cuDNN versions. The layer_factory file is responsible for assembling Caffe's operators; applying the factory pattern here means that, at run time, the version of each operator to execute is chosen according to the user's settings.
The following draws on http://zhuanlan.zhihu.com/hacker-and-painter/20456649
layer_factory.hpp is the header file of layer_factory:
```cpp
/**
 * @brief A layer factory that allows one to register layers.
 * During runtime, registered layers could be called by passing a LayerParameter
 * protobuffer to the CreateLayer function:
 *
 *     LayerRegistry<Dtype>::CreateLayer(param);
 *
 * There are two ways to register a layer. Assuming that we have a layer like:
 *
 *   template <typename Dtype>
 *   class MyAwesomeLayer : public Layer<Dtype> {
 *     // your implementations
 *   };
 *
 * and its type is its C++ class name, but without the "Layer" at the end
 * ("MyAwesomeLayer" -> "MyAwesome").
 *
 * If the layer is going to be created simply by its constructor, in your c++
 * file, add the following line:
 *
 *    REGISTER_LAYER_CLASS(MyAwesome);
 *
 * Or, if the layer is going to be created by another creator function, in the
 * format of:
 *
 *    template <typename Dtype>
 *    Layer<Dtype*> GetMyAwesomeLayer(const LayerParameter& param) {
 *      // your implementation
 *    }
 *
 * (for example, when your layer has multiple backends, see GetConvolutionLayer
 * for a use case), then you can register the creator function instead, like
 *
 * REGISTER_LAYER_CREATOR(MyAwesome, GetMyAwesomeLayer)
 *
 * Note that each layer type should only be registered once.
 */
#ifndef CAFFE_LAYER_FACTORY_H_
#define CAFFE_LAYER_FACTORY_H_

#include <map>
#include <string>

#include "caffe/common.hpp"
#include "caffe/proto/caffe.pb.h"

namespace caffe {

template <typename Dtype>
class Layer;

// LayerRegistry is simple: it puts each class and its type string into a map
// so layers can be looked up flexibly. Its job is registering layer classes.
template <typename Dtype>
class LayerRegistry {
 public:
  // Creator is a function pointer returning a Layer<Dtype> pointer
  typedef shared_ptr<Layer<Dtype> > (*Creator)(const LayerParameter&);
  // CreatorRegistry maps each type string to its Creator
  typedef std::map<string, Creator> CreatorRegistry;

  static CreatorRegistry& Registry() {
    static CreatorRegistry* g_registry_ = new CreatorRegistry();
    return *g_registry_;
  }

  // Adds a creator.
  // Insert the (type string, function pointer) pair into the table
  static void AddCreator(const string& type, Creator creator) {
    CreatorRegistry& registry = Registry();
    CHECK_EQ(registry.count(type), 0)
        << "Layer type " << type << " already registered.";
    registry[type] = creator;
  }

  // Get a layer using a LayerParameter.
  // Given a layer type, create the layer
  static shared_ptr<Layer<Dtype> > CreateLayer(const LayerParameter& param) {
    LOG(INFO) << "Creating layer " << param.name();
    // Get the type string from the parameter
    const string& type = param.type();
    // Check whether a Creator for the given type is registered
    CreatorRegistry& registry = Registry();
    CHECK_EQ(registry.count(type), 1) << "Unknown layer type: " << type
        << " (known types: " << LayerTypeList() << ")";
    // Invoke the Creator function of the corresponding layer
    return registry[type](param);
  }

 private:
  // Layer registry should never be instantiated - everything is done with its
  // static variables.
  // The constructor is private to forbid instantiation; every member is static
  LayerRegistry() {}

  // Return the list of registered layer types
  static string LayerTypeList() {
    // Get the registry
    CreatorRegistry& registry = Registry();
    string layer_types;
    // Walk the registry and append each type to the layer_types string
    for (typename CreatorRegistry::iterator iter = registry.begin();
         iter != registry.end(); ++iter) {
      if (iter != registry.begin()) {
        layer_types += ", ";
      }
      layer_types += iter->first;
    }
    return layer_types;
  }
};

// LayerRegisterer
// A registerer for user-defined layers,
// used by the macros below
template <typename Dtype>
class LayerRegisterer {
 public:
  // The registerer's constructor...
  LayerRegisterer(const string& type,
                  shared_ptr<Layer<Dtype> > (*creator)(const LayerParameter&)) {
    // LOG(INFO) << "Registering layer type: " << type;
    // ...simply calls AddCreator to put the Creator into the registry
    LayerRegistry<Dtype>::AddCreator(type, creator);
  }
};

// For convenience the authors also provide macros for registering your own
// layer class.
// Generates the two registerer variables g_creator_f_##type and
// g_creator_d_##type (the float and double instantiations)
#define REGISTER_LAYER_CREATOR(type, creator)                                  \
  static LayerRegisterer<float> g_creator_f_##type(#type, creator<float>);     \
  static LayerRegisterer<double> g_creator_d_##type(#type, creator<double>)    \

/* Registers your own class, whose type name is `type`.
   Suppose, say, type = bias; then code like the following is generated.
   This function directly calls your class's constructor to create an
   instance and return it:
     Creator_biasLayer(const LayerParameter& param)
   The next statement defines a static LayerRegisterer<float> variable
   g_creator_f_biasLayer (float instantiation; in effect it binds your class's
   type string and its creator into the registry):
     static LayerRegisterer<float> g_creator_f_biasLayer("bias", Creator_biasLayer<float>);
   And likewise a LayerRegisterer<double> variable g_creator_d_biasLayer
   (double instantiation):
     static LayerRegisterer<double> g_creator_d_biasLayer("bias", Creator_biasLayer<double>);
*/
#define REGISTER_LAYER_CLASS(type)                                             \
  template <typename Dtype>                                                    \
  shared_ptr<Layer<Dtype> > Creator_##type##Layer(const LayerParameter& param) \
  {                                                                            \
    return shared_ptr<Layer<Dtype> >(new type##Layer<Dtype>(param));           \
  }                                                                            \
  REGISTER_LAYER_CREATOR(type, Creator_##type##Layer)

}  // namespace caffe

#endif  // CAFFE_LAYER_FACTORY_H_
```
With the above in mind, here is the implementation part (it differs somewhat from the 1.0 release, but not in any major way).
layer_factory.cpp:
```cpp
// Make sure we include Python.h before any system header
// to avoid _POSIX_C_SOURCE redefinition
#ifdef WITH_PYTHON_LAYER
#include <boost/python.hpp>
#endif
#include <string>

#include "caffe/layer.hpp"
#include "caffe/layer_factory.hpp"
#include "caffe/proto/caffe.pb.h"
#include "caffe/vision_layers.hpp"

#ifdef WITH_PYTHON_LAYER
#include "caffe/python_layer.hpp"
#endif

namespace caffe {

// A creator function that returns a convolution layer instance.
// Get convolution layer according to engine.
template <typename Dtype>
shared_ptr<Layer<Dtype> > GetConvolutionLayer(
    const LayerParameter& param) {
  // Read from the parameter which engine to use: CUDNN, CAFFE or DEFAULT.
  // caffe.proto shows that engine is an enum type.
  ConvolutionParameter_Engine engine = param.convolution_param().engine();
  if (engine == ConvolutionParameter_Engine_DEFAULT) {
    engine = ConvolutionParameter_Engine_CAFFE;
#ifdef USE_CUDNN
    engine = ConvolutionParameter_Engine_CUDNN;
#endif
  }
  if (engine == ConvolutionParameter_Engine_CAFFE) {
    // Directly construct Caffe's own convolution layer
    return shared_ptr<Layer<Dtype> >(new ConvolutionLayer<Dtype>(param));
#ifdef USE_CUDNN
  } else if (engine == ConvolutionParameter_Engine_CUDNN) {
    // Construct the cuDNN convolution layer
    return shared_ptr<Layer<Dtype> >(new CuDNNConvolutionLayer<Dtype>(param));
#endif
  } else {  // anything else is an error
    LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
  }
}
// Register the convolution layer under the type name "Convolution",
// with GetConvolutionLayer as its creator
REGISTER_LAYER_CREATOR(Convolution, GetConvolutionLayer);

// Get a pooling layer instance; same logic as the convolution layer.
// Get pooling layer according to engine.
template <typename Dtype>
shared_ptr<Layer<Dtype> > GetPoolingLayer(const LayerParameter& param) {
  PoolingParameter_Engine engine = param.pooling_param().engine();
  if (engine == PoolingParameter_Engine_DEFAULT) {
    engine = PoolingParameter_Engine_CAFFE;
#ifdef USE_CUDNN
    engine = PoolingParameter_Engine_CUDNN;
#endif
  }
  if (engine == PoolingParameter_Engine_CAFFE) {
    return shared_ptr<Layer<Dtype> >(new PoolingLayer<Dtype>(param));
#ifdef USE_CUDNN
  } else if (engine == PoolingParameter_Engine_CUDNN) {
    PoolingParameter p_param = param.pooling_param();
    if (p_param.pad() || p_param.pad_h() || p_param.pad_w() ||
        param.top_size() > 1) {
      LOG(INFO) << "CUDNN does not support padding or multiple tops. "
                << "Using Caffe's own pooling layer.";
      return shared_ptr<Layer<Dtype> >(new PoolingLayer<Dtype>(param));
    }
    return shared_ptr<Layer<Dtype> >(new CuDNNPoolingLayer<Dtype>(param));
#endif
  } else {
    LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
  }
}
// Register the pooling layer
REGISTER_LAYER_CREATOR(Pooling, GetPoolingLayer);

// Register the ReLU layer
// Get relu layer according to engine.
template <typename Dtype>
shared_ptr<Layer<Dtype> > GetReLULayer(const LayerParameter& param) {
  ReLUParameter_Engine engine = param.relu_param().engine();
  if (engine == ReLUParameter_Engine_DEFAULT) {
    engine = ReLUParameter_Engine_CAFFE;
#ifdef USE_CUDNN
    engine = ReLUParameter_Engine_CUDNN;
#endif
  }
  if (engine == ReLUParameter_Engine_CAFFE) {
    return shared_ptr<Layer<Dtype> >(new ReLULayer<Dtype>(param));
#ifdef USE_CUDNN
  } else if (engine == ReLUParameter_Engine_CUDNN) {
    return shared_ptr<Layer<Dtype> >(new CuDNNReLULayer<Dtype>(param));
#endif
  } else {
    LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
  }
}
REGISTER_LAYER_CREATOR(ReLU, GetReLULayer);

// Register the sigmoid layer
// Get sigmoid layer according to engine.
template <typename Dtype>
shared_ptr<Layer<Dtype> > GetSigmoidLayer(const LayerParameter& param) {
  SigmoidParameter_Engine engine = param.sigmoid_param().engine();
  if (engine == SigmoidParameter_Engine_DEFAULT) {
    engine = SigmoidParameter_Engine_CAFFE;
#ifdef USE_CUDNN
    engine = SigmoidParameter_Engine_CUDNN;
#endif
  }
  if (engine == SigmoidParameter_Engine_CAFFE) {
    return shared_ptr<Layer<Dtype> >(new SigmoidLayer<Dtype>(param));
#ifdef USE_CUDNN
  } else if (engine == SigmoidParameter_Engine_CUDNN) {
    return shared_ptr<Layer<Dtype> >(new CuDNNSigmoidLayer<Dtype>(param));
#endif
  } else {
    LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
  }
}
REGISTER_LAYER_CREATOR(Sigmoid, GetSigmoidLayer);

// Register the softmax layer
// Get softmax layer according to engine.
template <typename Dtype>
shared_ptr<Layer<Dtype> > GetSoftmaxLayer(const LayerParameter& param) {
  SoftmaxParameter_Engine engine = param.softmax_param().engine();
  if (engine == SoftmaxParameter_Engine_DEFAULT) {
    engine = SoftmaxParameter_Engine_CAFFE;
#ifdef USE_CUDNN
    engine = SoftmaxParameter_Engine_CUDNN;
#endif
  }
  if (engine == SoftmaxParameter_Engine_CAFFE) {
    return shared_ptr<Layer<Dtype> >(new SoftmaxLayer<Dtype>(param));
#ifdef USE_CUDNN
  } else if (engine == SoftmaxParameter_Engine_CUDNN) {
    return shared_ptr<Layer<Dtype> >(new CuDNNSoftmaxLayer<Dtype>(param));
#endif
  } else {
    LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
  }
}
REGISTER_LAYER_CREATOR(Softmax, GetSoftmaxLayer);

// Register the tanh layer
// Get tanh layer according to engine.
template <typename Dtype>
shared_ptr<Layer<Dtype> > GetTanHLayer(const LayerParameter& param) {
  TanHParameter_Engine engine = param.tanh_param().engine();
  if (engine == TanHParameter_Engine_DEFAULT) {
    engine = TanHParameter_Engine_CAFFE;
#ifdef USE_CUDNN
    engine = TanHParameter_Engine_CUDNN;
#endif
  }
  if (engine == TanHParameter_Engine_CAFFE) {
    return shared_ptr<Layer<Dtype> >(new TanHLayer<Dtype>(param));
#ifdef USE_CUDNN
  } else if (engine == TanHParameter_Engine_CUDNN) {
    return shared_ptr<Layer<Dtype> >(new CuDNNTanHLayer<Dtype>(param));
#endif
  } else {
    LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
  }
}
REGISTER_LAYER_CREATOR(TanH, GetTanHLayer);

// Register the Python layer
#ifdef WITH_PYTHON_LAYER
template <typename Dtype>
shared_ptr<Layer<Dtype> > GetPythonLayer(const LayerParameter& param) {
  Py_Initialize();
  try {
    bp::object module = bp::import(param.python_param().module().c_str());
    bp::object layer = module.attr(param.python_param().layer().c_str())(param);
    return bp::extract<shared_ptr<PythonLayer<Dtype> > >(layer)();
  } catch (bp::error_already_set) {
    PyErr_Print();
    throw;
  }
}
REGISTER_LAYER_CREATOR(Python, GetPythonLayer);
#endif

// Layers that use their constructor as their default creator should be
// registered in their corresponding cpp files. Do not register them here.
}  // namespace caffe
```
3. A pitfall in layer_factory
In the current code, the registration path of the Pooling layer contains this:
```cpp
// CuDNN assumes layers are not being modified in place, thus
// breaking our index tracking for updates in some cases in Caffe.
// Until there is a workaround in Caffe (index management) or
// cuDNN, use Caffe layer to max pooling, or don't use in place
// layers after max pooling layers
if (param.pooling_param().pool() == PoolingParameter_PoolMethod_MAX) {
  return shared_ptr<Layer<Dtype> >(new PoolingLayer<Dtype>(param));
} else {
  return shared_ptr<Layer<Dtype> >(new CuDNNPoolingLayer<Dtype>(param));
}
```
The direct consequence is that as long as you use MaxPool, you always get Caffe's own .cu implementation and can never reach the cuDNN version. This explains why our MaxPool benchmark never changed.
4. Impact of the problem
But why don't the Caffe authors use cuDNN's MaxPool? Consulting NVIDIA's cuDNN User Manual, we find:
4.144. cudnnPoolingForward
```cpp
cudnnStatus_t cudnnPoolingForward(
    cudnnHandle_t                    handle,
    const cudnnPoolingDescriptor_t   poolingDesc,
    const void                      *alpha,
    const cudnnTensorDescriptor_t    xDesc,
    const void                      *x,
    const void                      *beta,
    const cudnnTensorDescriptor_t    yDesc,
    void                            *y)
```
This function computes pooling of input values (i.e., the maximum or average of several adjacent values) to produce an output with smaller height and/or width.
Parameters

- handle: Input. Handle to a previously created cuDNN context.
- poolingDesc: Input. Handle to a previously initialized pooling descriptor.
- alpha, beta: Input. Pointers to scaling factors (in host memory) used to blend the computation result with the prior value in the output layer as follows: dstValue = alpha[0]*result + beta[0]*priorDstValue.
- xDesc: Input. Handle to the previously initialized input tensor descriptor. Must be of type FLOAT, DOUBLE, HALF, or INT8. See cudnnDataType_t.
- x: Input. Data pointer to GPU memory associated with the tensor descriptor xDesc.
- yDesc: Input. Handle to the previously initialized output tensor descriptor. Must be of type FLOAT, DOUBLE, HALF, or INT8. See cudnnDataType_t.
- y: Output. Data pointer to GPU memory associated with the output tensor descriptor yDesc.

The possible error values returned by this function and their meanings are listed below.

Returns

- CUDNN_STATUS_SUCCESS: The function launched successfully.
- CUDNN_STATUS_BAD_PARAM: At least one of the following conditions is met:
  - The dimensions n, c of the input tensor and output tensor differ.
  - The datatype of the input tensor and output tensor differs.
- CUDNN_STATUS_NOT_SUPPORTED: The function does not support the provided configuration, for example:
  - The wStride of the input tensor or output tensor is not 1.
- CUDNN_STATUS_EXECUTION_FAILED: The function failed to launch on the GPU.
What is curious here is that only the input and output tensors (x and y) can be passed in, so there is no way to produce or update the max-pooling mask. I do not quite follow the cuDNN designers' reasoning; as things stand, preserving correctness means cuDNN's PoolingForward cannot be used here for the time being.