SGD and activation functions in Caffe
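In Caffe, the form of the activation function directly affects training speed and how SGD converges.
Different activation functions produce different gradients during back-propagation, so the SGD updates differ as well. That is why the type of the activation layer is specified in the network configuration file. At the moment, ReLU is the most widely used activation function in Caffe.
Caffe currently implements the following activation functions, each as its own layer: absval, bnll, power, relu, sigmoid, and tanh.
I won't work through the formulas myself here; the relevant part of the Caffe tutorial follows.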
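To make the link between the activation function and SGD concrete, here is a minimal sketch in plain Python (this is not Caffe code). It assumes a single neuron y = f(w * x), a squared-error loss, and a hypothetical learning rate `lr`; the names `sgd_step`, `relu_grad`, and `sigmoid_grad` are made up for illustration. The derivative of the activation multiplies the gradient, so it scales every SGD update.

```python
import math

def relu(z):
    return max(z, 0.0)

def relu_grad(z):
    return 1.0 if z > 0 else 0.0

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def sgd_step(w, x, target, act, act_grad, lr=0.1):
    """One SGD update for y = act(w * x) with loss 0.5 * (y - target)^2."""
    z = w * x
    y = act(z)
    # Chain rule: dL/dw = (y - target) * act'(z) * x
    grad_w = (y - target) * act_grad(z) * x
    return w - lr * grad_w

# Same starting weight, input, and target; only the activation differs.
w_relu = sgd_step(0.5, 2.0, 3.0, relu, relu_grad)
w_sigm = sgd_step(0.5, 2.0, 3.0, sigmoid, sigmoid_grad)
```

With these inputs the ReLU neuron moves its weight further in one step than the sigmoid neuron does, because the sigmoid derivative is never larger than 0.25.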
ReLU / Rectified-Linear and Leaky-ReLU
- LayerType: RELU
- CPU implementation: ./src/caffe/layers/relu_layer.cpp
- CUDA GPU implementation: ./src/caffe/layers/relu_layer.cu
- Parameters (ReLUParameter relu_param)
  - Optional
    - negative_slope [default 0]: specifies whether to leak the negative part by multiplying it with the slope value rather than setting it to 0.
- Sample (as seen in ./examples/imagenet/imagenet_train_val.prototxt)
layers {
  name: "relu1"
  type: RELU
  bottom: "conv1"
  top: "conv1"
}
Given an input value x, the RELU layer computes the output as x if x > 0 and negative_slope * x if x <= 0. When the negative_slope parameter is not set, it is equivalent to the standard ReLU function max(x, 0). The layer also supports in-place computation, meaning that the bottom and top blobs can be the same (both are "conv1" in the sample), which reduces memory consumption.
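The rule described above can be sketched in a few lines of plain Python. This is an illustration of the formula, not Caffe's implementation; `relu_forward`, `relu_backward`, and `top_diff` (the gradient arriving from the layer above) are names assumed for this example.

```python
def relu_forward(x, negative_slope=0.0):
    """Element-wise: x if x > 0, otherwise negative_slope * x."""
    return [max(v, 0.0) + negative_slope * min(v, 0.0) for v in x]

def relu_backward(x, top_diff, negative_slope=0.0):
    """Gradient w.r.t. the input: 1 where x > 0, negative_slope elsewhere."""
    return [d * (1.0 if v > 0 else negative_slope)
            for v, d in zip(x, top_diff)]

x = [-2.0, 0.0, 3.0]
plain = relu_forward(x)                      # standard ReLU
leaky = relu_forward(x, negative_slope=0.1)  # Leaky-ReLU
grads = relu_backward(x, [1.0, 1.0, 1.0], negative_slope=0.1)
```

With negative_slope left at 0, negative inputs receive no gradient at all; a small positive slope keeps a small gradient flowing through them.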
Sigmoid
- LayerType: SIGMOID
- CPU implementation: ./src/caffe/layers/sigmoid_layer.cpp
- CUDA GPU implementation: ./src/caffe/layers/sigmoid_layer.cu
- Sample (as seen in ./examples/imagenet/mnist_autoencoder.prototxt)
layers {
  name: "encode1neuron"
  bottom: "encode1"
  top: "encode1neuron"
  type: SIGMOID
}
The SIGMOID layer computes the output as sigmoid(x) for each input element x.
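For reference, sigmoid(x) = 1 / (1 + exp(-x)). Below is a small sketch in plain Python that evaluates it without overflow for large |x|; the branch on the sign of x is a common numerical trick and an assumption of this example, not something taken from the Caffe source.

```python
import math

def sigmoid(x):
    """Logistic sigmoid 1 / (1 + exp(-x)), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # x < 0 here, so exp(x) cannot overflow
    return e / (1.0 + e)

def sigmoid_grad(x):
    """Derivative: sigmoid(x) * (1 - sigmoid(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)
```

The derivative peaks at 0.25 (at x = 0) and goes to zero for large |x|, which is one reason sigmoid units tend to train more slowly than ReLU units.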
TanH / Hyperbolic Tangent
- LayerType: TANH
- CPU implementation: ./src/caffe/layers/tanh_layer.cpp
- CUDA GPU implementation: ./src/caffe/layers/tanh_layer.cu
- Sample
layers {
  name: "layer"
  bottom: "in"
  top: "out"
  type: TANH
}
The TANH layer computes the output as tanh(x) for each input element x.
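A short sketch of the forward and backward pass in plain Python, assuming the hypothetical helper names `tanh_forward` and `tanh_backward`. It relies on the identity d tanh(x)/dx = 1 - tanh(x)^2, so the backward pass needs only the layer's outputs.

```python
import math

def tanh_forward(x):
    return [math.tanh(v) for v in x]

def tanh_backward(top, top_diff):
    """Uses d tanh(x)/dx = 1 - tanh(x)^2; `top` holds the forward outputs."""
    return [d * (1.0 - y * y) for y, d in zip(top, top_diff)]

top = tanh_forward([-1.0, 0.0, 1.0])
grads = tanh_backward(top, [1.0, 1.0, 1.0])
```

Unlike the sigmoid, tanh is zero-centered and its derivative reaches 1 at x = 0.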
Absolute Value
- LayerType: ABSVAL
- CPU implementation: ./src/caffe/layers/absval_layer.cpp
- CUDA GPU implementation: ./src/caffe/layers/absval_layer.cu
- Sample
layers {
  name: "layer"
  bottom: "in"
  top: "out"
  type: ABSVAL
}
The ABSVAL layer computes the output as abs(x) for each input element x.
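As a rough illustration in plain Python: the forward pass is abs(x) and the gradient is sign(x). The function names are hypothetical, and using 0 as the gradient at x = 0 is a convention assumed for this sketch.

```python
def absval_forward(x):
    return [abs(v) for v in x]

def absval_backward(x, top_diff):
    """d|x|/dx = sign(x); this sketch uses 0 at x == 0 (assumed convention)."""
    def sign(v):
        return (v > 0) - (v < 0)
    return [d * sign(v) for v, d in zip(x, top_diff)]
```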
Power
- LayerType: POWER
- CPU implementation: ./src/caffe/layers/power_layer.cpp
- CUDA GPU implementation: ./src/caffe/layers/power_layer.cu
- Parameters (PowerParameter power_param)
  - Optional
    - power [default 1]
    - scale [default 1]
    - shift [default 0]
- Sample
layers {
  name: "layer"
  bottom: "in"
  top: "out"
  type: POWER
  power_param {
    power: 1
    scale: 1
    shift: 0
  }
}
The POWER layer computes the output as (shift + scale * x) ^ power for each input element x.
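The formula and its derivative can be sketched in plain Python as follows. The function names are assumptions for this example; the keyword defaults mirror the power, scale, and shift defaults listed above, under which the layer is the identity.

```python
def power_forward(x, power=1.0, scale=1.0, shift=0.0):
    """Element-wise (shift + scale * x) ** power."""
    return [(shift + scale * v) ** power for v in x]

def power_backward(x, top_diff, power=1.0, scale=1.0, shift=0.0):
    """d/dx (shift + scale*x)^power = power * scale * (shift + scale*x)^(power - 1)."""
    return [d * power * scale * (shift + scale * v) ** (power - 1.0)
            for v, d in zip(x, top_diff)]

identity = power_forward([1.0, 2.0, 3.0])
squared = power_forward([1.0, 2.0], power=2.0, scale=2.0, shift=1.0)
grads = power_backward([1.0, 2.0], [1.0, 1.0], power=2.0, scale=2.0, shift=1.0)
```

Note that in Python a negative base with a non-integer power yields a complex number, so this sketch is only meant for inputs where shift + scale * x is non-negative or the power is an integer.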
BNLL
- LayerType: BNLL
- CPU implementation: ./src/caffe/layers/bnll_layer.cpp
- CUDA GPU implementation: ./src/caffe/layers/bnll_layer.cu
- Sample
layers {
  name: "layer"
  bottom: "in"
  top: "out"
  type: BNLL
}
The BNLL (binomial normal log likelihood) layer computes the output as log(1 + exp(x)) for each input element x.
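Evaluating log(1 + exp(x)) literally overflows for large positive x. The sketch below, in plain Python with the hypothetical function name `bnll`, rearranges the expression so that exp() only ever receives a non-positive argument. It illustrates the formula under that assumption and does not claim to reproduce the Caffe source.

```python
import math

def bnll(x):
    """log(1 + exp(x)), rearranged so exp() never sees a large positive value."""
    if x > 0:
        # log(1 + exp(x)) = x + log(1 + exp(-x))
        return x + math.log1p(math.exp(-x))
    return math.log1p(math.exp(x))
```

The function behaves like a smooth version of ReLU: it approaches 0 for very negative x and approaches x for very positive x, and its derivative is the sigmoid of x.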