TensorFlow Activation Functions and Normalization Functions

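The role of activation functions, quoted from 《TensorFlow实践》:

These functions are used together with the outputs of other layers to generate feature maps. They smooth or differentiate the results of certain operations, and their purpose is to introduce nonlinearity into the neural network, since a curve can capture complex variations in the input. TensorFlow provides several activation functions. tf.nn.relu is the usual choice in CNNs because, although relu discards some information, it performs very well. When first designing a model, relu is a reasonable default. Advanced users can also write their own activation functions; the main factors for judging whether an activation function is useful are:

1) The function is monotonic: its output increases as the input increases and decreases as the input decreases, which makes it possible to find a local extremum with gradient descent.

2) The function is differentiable: its derivative exists at every point of its domain, so that gradient descent can work properly with the outputs of this kind of activation function.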
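The two criteria above can be checked numerically for a candidate function. The following is a minimal sketch in plain Python (no TensorFlow involved); the helper names `is_monotonic` and `numeric_derivative` are hypothetical and exist only for this illustration.

```python
import math

def relu(x):
    """Plain-Python stand-in for tf.nn.relu on a single scalar."""
    return max(x, 0.0)

def sigmoid(x):
    """Plain-Python stand-in for tf.sigmoid on a single scalar."""
    return 1.0 / (1.0 + math.exp(-x))

def is_monotonic(f, xs):
    """True if f never decreases over the sorted sample points xs (hypothetical helper)."""
    ys = [f(x) for x in xs]
    return all(a <= b for a, b in zip(ys, ys[1:]))

def numeric_derivative(f, x, h=1e-6):
    """Central finite-difference estimate of f'(x) (hypothetical helper)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

xs = [i / 10.0 for i in range(-50, 51)]  # sample points in [-5, 5]
print(is_monotonic(relu, xs), is_monotonic(sigmoid, xs))
print(numeric_derivative(sigmoid, 0.0))  # the analytic value is 0.25
```

Note that relu is monotonic but has no derivative at exactly 0, so in practice an implementation has to pick a value for the gradient at that point.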

Common activation functions provided by TensorFlow (for details see http://www.tensorfly.cn/tfdoc/api_docs/python/nn.html):

1.tf.nn.relu(features, name=None)

Computes rectified linear: max(features, 0).

features: A Tensor. Must be one of the following types: float32, float64, int32, int64, uint8, int16, int8.
name: A name for the operation (optional).

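Notes:

Advantage: relu is not affected by the "vanishing gradient" problem, and its range is [0, +∞).

Disadvantage: with a large learning rate, it is prone to saturated ("dead") neurons, that is, units whose output stays at 0.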
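To make these notes concrete, here is a small plain-Python sketch (not TensorFlow's implementation) of relu and its derivative. The name `relu_grad` is hypothetical, and the value it returns at x == 0 is a convention chosen for this sketch.

```python
def relu(x):
    """max(x, 0) for a single scalar."""
    return max(x, 0.0)

def relu_grad(x):
    """Derivative of relu; the value at x == 0 is a convention (0 here)."""
    return 1.0 if x > 0 else 0.0

inputs = [-3.0, -0.5, 0.0, 0.5, 3.0]
outputs = [relu(x) for x in inputs]
grads = [relu_grad(x) for x in inputs]
print(outputs)  # [0.0, 0.0, 0.0, 0.5, 3.0]
print(grads)    # [0.0, 0.0, 0.0, 1.0, 1.0]
```

The gradient is a constant 1 for positive inputs, so it does not shrink as the input grows. It is exactly 0 for negative inputs, so a unit that a large update pushes into the negative regime receives no gradient and can stop learning.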

2.tf.nn.relu6(features, name=None)

Computes Rectified Linear 6: min(max(features, 0), 6).

features: A Tensor with type float, double, int32, int64, uint8, int16, or int8.
name: A name for the operation (optional).
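As a quick plain-Python sketch of the formula above (an illustration only, not TensorFlow's implementation), relu6 is relu with the output additionally capped at 6:

```python
def relu6(x):
    """min(max(x, 0), 6) for a single scalar."""
    return min(max(x, 0.0), 6.0)

values = [relu6(x) for x in [-2.0, 0.0, 3.0, 6.0, 9.0]]
print(values)  # [0.0, 0.0, 3.0, 6.0, 6.0]
```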

3.tf.sigmoid(x, name=None)

Computes sigmoid of x element-wise.

Specifically, y = 1 / (1 + exp(-x)).

x: A Tensor with type float, double, int32, complex64, int64, or qint32.
name: A name for the operation (optional).

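Notes:

Advantage: in a neural network trained on samples, sigmoid's ability to keep the output within [0.0, 1.0] is very useful.

Disadvantage: when the output is close to saturation or changes sharply, this compression of the output range often causes problems.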
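The saturation mentioned above can be seen from the derivative of sigmoid, which is y * (1 - y). The following plain-Python sketch (the name `sigmoid_grad` is hypothetical) prints the output and the derivative for a few inputs:

```python
import math

def sigmoid(x):
    """y = 1 / (1 + exp(-x)) for a single scalar."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative of sigmoid, written in terms of its output: y * (1 - y)."""
    y = sigmoid(x)
    return y * (1.0 - y)

for x in [0.0, 2.0, 5.0, 10.0]:
    print(x, sigmoid(x), sigmoid_grad(x))
```

The derivative peaks at 0.25 for x = 0 and falls below 1e-4 by x = 10, so once the output is near 0 or 1 very little gradient flows back through the unit.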

4.tf.nn.softplus(features, name=None)

Computes softplus: log(exp(features) + 1).

features: A Tensor. Must be one of the following types: float32, float64, int32, int64, uint8, int16, int8.
name: A name for the operation (optional).
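Evaluating log(exp(x) + 1) literally overflows for large x. A common workaround, sketched below in plain Python, rewrites it with the identity log(1 + e^x) = max(x, 0) + log(1 + e^(-|x|)). Both function names are hypothetical, and this is only an illustration of the formula, not a description of how TensorFlow computes softplus internally.

```python
import math

def softplus_naive(x):
    """Direct transcription of log(exp(x) + 1); overflows for large x."""
    return math.log(math.exp(x) + 1.0)

def softplus_stable(x):
    """Same value via max(x, 0) + log1p(exp(-|x|)), which cannot overflow."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

for x in [-5.0, 0.0, 5.0]:
    print(x, softplus_naive(x), softplus_stable(x))

print(softplus_stable(1000.0))  # softplus_naive(1000.0) raises OverflowError
```

Softplus behaves like a smooth version of relu: it approaches 0 for very negative inputs and approaches x for very positive inputs.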

5.tf.tanh(x, name=None)

Computes hyperbolic tangent of x element-wise.

x: A Tensor with type float, double, int32, complex64, int64, or qint32.
name: A name for the operation (optional).

Notes:

Advantage: the hyperbolic tangent is quite similar to sigmoid and shares its advantages, while also being able to output negative values; the range of tanh is [-1.0, 1.0].
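The similarity between the two functions is exact: tanh(x) = 2 * sigmoid(2x) - 1, so tanh is a sigmoid rescaled to be centered on 0. A short plain-Python check of this identity (a sketch using only the standard library):

```python
import math

def sigmoid(x):
    """y = 1 / (1 + exp(-x)) for a single scalar."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh_via_sigmoid(x):
    """tanh expressed through sigmoid: 2 * sigmoid(2x) - 1."""
    return 2.0 * sigmoid(2.0 * x) - 1.0

for x in [-5.0, -1.0, 0.0, 1.0, 5.0]:
    print(x, math.tanh(x), tanh_via_sigmoid(x))
```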

MATLAB code illustrating the shape of each function:

clear all
close all
clc

% ACTIVATION FUNCTIONS %
X = linspace(-5,5,100);
plot(X)
title('feature = X')

% tf.nn.relu(features, name=None): max(features, 0) %
Y_relu = max(X,0);
figure,plot(X,Y_relu)   % plot against X so the horizontal axis shows the input value
title('tf.nn.relu(features, name=None)')

% tf.nn.relu6(features, name=None): min(max(features, 0), 6) %
Y_relu6 = min(max(X,0),6);
figure,plot(X,Y_relu6)
title('tf.nn.relu6(features, name=None)')

% tf.sigmoid(x, name=None): y = 1 / (1 + exp(-x)) %
Y_sigmoid = 1./(1+exp(-1.*X));
figure,plot(X,Y_sigmoid)
title('tf.sigmoid(x, name=None)')

% tf.nn.softplus(features, name=None): log(exp(features) + 1) %
Y_softplus = log(exp(X) + 1);
figure,plot(X,Y_softplus)
title('tf.nn.softplus(features, name=None)')

% tf.tanh(x, name=None): tanh(features) %
Y_tanh = tanh(X);
figure,plot(X,Y_tanh)
title('tf.tanh(x, name=None)')

[Figures: X = feature; tf.nn.relu(features, name=None); tf.nn.relu6(features, name=None); tf.sigmoid(x, name=None); tf.nn.softplus(features, name=None); tf.tanh(x, name=None)]

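The importance of normalization functions, quoted from 《TensorFlow实践》:

Normalization layers are not unique to CNNs. When using tf.nn.relu, it is worth considering normalizing the output (for details see http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf). Because relu is unbounded, some form of normalization is often very useful for identifying high-frequency features. Local response normalization is a data normalization method first used by Krizhevsky and Hinton in their ImageNet paper, and even now quite a few CNNs still use this regularization technique.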

`tf.nn.local_response_normalization(input, depth_radius=None, bias=None, alpha=None, beta=None, name=None)`

Local Response Normalization.

The 4-D input tensor is treated as a 3-D array of 1-D vectors (along the last dimension), and each vector is normalized independently. Within a given vector, each component is divided by the weighted, squared sum of inputs within depth_radius. In detail,

sqr_sum[a, b, c, d] =

    sum(input[a, b, c, d - depth_radius : d + depth_radius + 1] ** 2)

output = input / (bias + alpha * sqr_sum) ** beta

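In the notation of the ImageNet paper, this normalization is

$$b^i_{x,y} = \frac{a^i_{x,y}}{\left(k + \alpha \sum_{j=\max(0,\,i-n/2)}^{\min(N-1,\,i+n/2)} \left(a^j_{x,y}\right)^2\right)^{\beta}}$$

where the superscript of a is the index of the feature map within the layer, the subscripts x, y give the pixel position in the feature map, and N is the total number of feature maps. The other parameters in the formula are hyperparameters that you have to choose yourself.

Parameter 1, input: the feature map; being a feature map, it has the shape [batch, height, width, channels].
Parameter 2, depth_radius: must be specified by the caller; it is n/2 in the formula above.
Parameter 3, bias: k in the formula above.
Parameter 4, alpha: α in the formula above.
Parameter 5, beta: β in the formula above.
Parameter 6, name: the name of the operation.
The return value is the new feature map, which has the same shape as the original feature map.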
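The normalization of a single vector along the channel dimension can be sketched in plain Python as follows. This is an illustration under stated assumptions, not TensorFlow's implementation: the function name `lrn_vector` is hypothetical, and the exponent beta is applied to the whole denominator, (bias + alpha * sqr_sum) ** beta, as in the formula of the ImageNet paper.

```python
def lrn_vector(vec, depth_radius, bias, alpha, beta):
    """Normalize one 1-D vector taken along the channel (last) dimension."""
    out = []
    for d, value in enumerate(vec):
        # The window [d - depth_radius, d + depth_radius] is clipped at the edges.
        lo = max(0, d - depth_radius)
        hi = min(len(vec), d + depth_radius + 1)
        sqr_sum = sum(v * v for v in vec[lo:hi])
        out.append(value / (bias + alpha * sqr_sum) ** beta)
    return out

result = lrn_vector([1.0, 1.0, 1.0, 1.0], depth_radius=1, bias=1.0, alpha=1.0, beta=0.5)
print(result)
```

In this example the two middle channels see three neighbors in their window (sqr_sum = 3), while the two edge channels see only two (sqr_sum = 2), so the edge channels are divided by a smaller value.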

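This method is inspired by neuroscience: an activated neuron suppresses the activity of its neighboring neurons (lateral inhibition). As for why this regularization technique is used and why it works, I went through a lot of literature and found no detailed explanation. Perhaps batch normalization, which was proposed later, became so popular that it gradually overshadowed local response normalization.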

import tensorflow as tf

# Uses the TensorFlow 1.x graph API (tf.Session).
a = tf.constant([
    [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0],
     [8.0, 7.0, 6.0, 5.0],
     [4.0, 3.0, 2.0, 1.0]],
    [[4.0, 3.0, 2.0, 1.0],
     [8.0, 7.0, 6.0, 5.0],
     [1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0]]
])

# Reshape a into a feature map of shape [batch: 1, height: 2, width: 2, channels: 8]
a = tf.reshape(a, [1, 2, 2, 8])

# depth_radius = 2, bias = 0, alpha = 1, beta = 1
normal_a = tf.nn.local_response_normalization(a, depth_radius=2, bias=0, alpha=1, beta=1)

with tf.Session() as sess:
    print("feature map:")
    image = sess.run(a)
    print(image)
    print("normalized feature map:")
    normal = sess.run(normal_a)
    print(normal)

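Explanation of the output:

Here I chose n/2 = 2, k = 0, α = 1, β = 1. N in the formula is the total number of channels of the input tensor: from a = tf.reshape(a, [1, 2, 2, 8]) we get N = 8. The variable i indexes the channels and runs from 0 to 7.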
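With these settings, the normalization of one pixel can be reproduced in plain Python. The sketch below assumes the first pixel of the reshaped feature map, whose eight channel values are 1.0 through 8.0; the variable names are hypothetical. Because k = 0, α = 1 and β = 1, each value is simply divided by the sum of squares in its window.

```python
pixel = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]  # channels 0..7 of the first pixel
radius = 2  # n/2

normalized = []
for i, value in enumerate(pixel):
    # Channels i - 2 .. i + 2, clipped to the valid range 0..7.
    window = pixel[max(0, i - radius): i + radius + 1]
    normalized.append(value / sum(v * v for v in window))  # k = 0, alpha = 1, beta = 1

print(normalized[0])  # 1 / (1^2 + 2^2 + 3^2) = 1/14
print(normalized[3])  # 4 / (2^2 + 3^2 + 4^2 + 5^2 + 6^2) = 4/90
```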

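For example, for the first pixel of channel 1, "1", substituting the parameters into the formula gives 1/(1^2+2^2+3^2) = 0.07142857; for the first pixel of channel 4, "4", the formula gives 4/(2^2+3^2+4^2+5^2+6^2) = 0.04444445; and so on.

Reposted from: http://blog.csdn.net/mao_xiao_feng/article/details/53488271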
