What are the advantages of ReLU over sigmoid function in deep neural network?
The state of the art of non-linearity is to use ReLU instead of sigmoid function in deep neural network, what are the advantages?
I know that training a network when ReLU is used would be faster, and it is more biological inspired, what are the other advantages? (That is, any disadvantages of using sigmoid)?
Best answer in stackexchange:
Two additional major benefits of ReLUs are sparsity and a reduced likelihood of vanishing gradient. But first recall the definition of a ReLU is h=max(0,a)h=max(0,a) where a=Wx+ba=Wx+b.
One major benefit is the reduced likelihood of the gradient to vanish. This arises when a>0a>0. In this regime the gradient has a constant value. In contrast, the gradient of sigmoids becomes increasingly small as the absolute value of x increases. The constant gradient of ReLUs results in faster learning.
The other benefit of ReLUs is sparsity. Sparsity arises when a≤0a≤0. The more such units that exist in a layer the more sparse the resulting representation. Sigmoids on the other hand are always likely to generate some non-zero value resulting in dense representations. Sparse representations seem to be more beneficial than dense representations.
ReLU
ReLU的全称是rectified linear unit。上面的回答基本上涵盖了它胜过sigmoid function的几个方面:
- faster
 - more biological inspired
 - sparsity
 - less chance of vanishing gradient (梯度消失问题)
 
早期使用sigmoid或tanh激活函数的DL在做unsupervised learning时因为 gradient vanishing problem 的问题会无法收敛。ReLU则这没有这个问题。
What are the advantages of ReLU over sigmoid function in deep neural network?的更多相关文章
- Sigmoid function in NN
		
X = [ones(m, ) X]; temp = X * Theta1'; t = size(temp, ); temp = [ones(t, ) temp]; h = temp * Theta2' ...
 - S性能 Sigmoid Function or Logistic Function
		
S性能 Sigmoid Function or Logistic Function octave码 x = -10:0.1:10; y = zeros(length(x), 1); for i = 1 ...
 - logistic function 和 sigmoid function
		
简单说, 只要曲线是 “S”形的函数都是sigmoid function: 满足公式<1>的形式的函数都是logistic function. 两者的相同点是: 函数曲线都是“S”形. ...
 - Sigmoid Function
		
本系列文章由 @yhl_leo 出品,转载请注明出处. 文章链接: http://blog.csdn.net/yhl_leo/article/details/51734189 Sigmodi 函数是一 ...
 - sigmoid function vs softmax function
		
DIFFERENCE BETWEEN SOFTMAX FUNCTION AND SIGMOID FUNCTION 二者主要的区别见于, softmax 用于多分类,sigmoid 则主要用于二分类: ...
 - sigmoid function的直观解释
		
Sigmoid function也叫Logistic function, 在logistic regression中扮演将回归估计值h(x)从 [-inf, inf]映射到[0,1]的角色. 公式为: ...
 - 神经网络中的激活函数具体是什么?为什么ReLu要好过于tanh和sigmoid function?(转)
		
为什么引入激活函数? 如果不用激励函数(其实相当于激励函数是f(x) = x),在这种情况下你每一层输出都是上层输入的线性函数,很容易验证,无论你神经网络有多少层,输出都是输入的线性组合,与没有隐藏层 ...
 - ReLU 和sigmoid 函数对比
		
详细对比请查看:http://www.zhihu.com/question/29021768/answer/43517930 . 激活函数的作用: 是为了增加神经网络模型的非线性.否则你想想,没有激活 ...
 - 小白学习之pytorch框架(5)-多层感知机(MLP)-(tensor、variable、计算图、ReLU()、sigmoid()、tanh())
		
先记录一下一开始学习torch时未曾记录(也未好好弄懂哈)导致又忘记了的tensor.variable.计算图 计算图 计算图直白的来说,就是数学公式(也叫模型)用图表示,这个图即计算图.借用 htt ...
 
随机推荐
- centos桌面使用
			
firefox添加flash插件 [root@bogon home]# cp libflashplayer.so /usr/lib64/mozilla/pl pl plugins/ plugins-w ...
 - 400 Bad Request(angluarJs)
			
今天做一个编辑的功能的时候,像后台传递一个实体,结果报400 Bad Request的错误....找了好久也没发现错误,老是报(不支持GET方式提交),检查好多遍我都是用的POST...不知道问题出在 ...
 - python - ConfigParser
			
ConfigParse 作用是解析配置文件. 配置文件格式如下 [test1]num1: 10[test2]num1: 35 配置文件两个概念section和option, 在这个例子中第一个sect ...
 - embedded tomcat context.xml
			
在网络下载相关的embedded tomcat jar.也可直接在maven中检索. 在main方法中,输入以下代码: //新建tomcat实例 Tomcat tomcat = new Tomcat( ...
 - js鲸鱼
			
<!DOCTYPE html> <html lang="en" xmlns="http://www.w3.org/1999/xhtml"> ...
 - csrf 跨站请求伪造
			
转自 http://www.cnblogs.com/hyddd/archive/2009/04/09/1432744.html
 - cocos2dx 3.x(场景(层)的生命周期)
			
//进入当前层初第一步始化层调用 bool GameScence::init() { if( !void init() ) { returnfalse; } log("进入当前层初第一步始化 ...
 - 【Apache】apache简单配置URL重写规则
			
[概述]URL重写就是首先获得一个进入的URL请求然后把它重新写成网站可以处理的另一个URL的过程.举个例子来说,如果通过浏览器进来的URL是index.php?type=news&& ...
 - adadmin: error while loading shared libraries: libclntsh.so.10.1
			
EBS R12.2运行adadmin报错: $ adadmin adadmin: error while loading shared libraries: libclntsh.so.10.1: ca ...
 - Leetcode: Sort Characters By Frequency
			
Given a string, sort it in decreasing order based on the frequency of characters. Example 1: Input: ...