[Repost] [TensorFlow] The difference between static_rnn and dynamic_rnn

Original article:

https://blog.csdn.net/qq_20135597/article/details/88980975

---------------------------------------------------------------------------------------------

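TensorFlow provides two RNN interfaces: a static RNN and a dynamic RNN.

Typical usage:

1. Static interface: static_rnn

Mainly uses tf.contrib.rnn (TensorFlow 1.x only; tf.contrib was removed in TensorFlow 2.x)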

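# x has shape [batch_size, n_steps, n_input]
x = tf.placeholder("float", [None, n_steps, n_input])
# static_rnn expects a Python list of n_steps tensors, each [batch_size, n_input]
x1 = tf.unstack(x, n_steps, 1)
lstm_cell = tf.contrib.rnn.BasicLSTMCell(n_hidden, forget_bias=1.0)
outputs, states = tf.contrib.rnn.static_rnn(lstm_cell, x1, dtype=tf.float32)
# outputs is a list with one tensor per time step; classify from the last one
pred = tf.contrib.layers.fully_connected(outputs[-1], n_classes, activation_fn=None)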

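A static RNN creates a network of fixed length (n_steps) in the graph: the cell is unrolled so that every time step gets its own ops. As a result:

Disadvantages:

Building the graph takes longer, uses more memory, and the exported model is larger;
Sequences longer than the length originally specified (> n_steps) cannot be fed in.

Advantages:

The model keeps the intermediate state of every time step in the sequence, which makes debugging easier.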
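
The trade-off above can be illustrated with a toy sketch in plain Python. It is not TensorFlow code: build_unrolled_graph and run_unrolled are hypothetical helpers, and a scalar recurrence stands in for the LSTM cell. It only assumes that unrolling creates one set of ops per time step, so the graph grows with n_steps, every intermediate state is kept, and a sequence longer than n_steps is rejected.

```python
def build_unrolled_graph(n_steps):
    """Return the op names a fully unrolled RNN would need: one per time step."""
    return ["cell_step_%d" % t for t in range(n_steps)]


def run_unrolled(graph, sequence):
    """Run the unrolled graph; the sequence must have exactly len(graph) steps."""
    if len(sequence) != len(graph):
        raise ValueError(
            "graph was built for %d steps, got %d" % (len(graph), len(sequence))
        )
    state = 0.0
    states = []  # every intermediate state stays addressable, handy for debugging
    for _op, x_t in zip(graph, sequence):
        state = 0.5 * state + x_t  # toy recurrence standing in for an LSTM cell
        states.append(state)
    return states


graph = build_unrolled_graph(n_steps=4)
print(len(graph))  # graph size grows with n_steps
print(run_unrolled(graph, [1.0, 1.0, 1.0, 1.0]))
try:
    run_unrolled(graph, [1.0] * 5)  # one step longer than the graph allows
except ValueError as err:
    print("error:", err)
```

The fixed-length check is the toy counterpart of the n_steps limit: the only way to accept a longer sequence is to build a new, larger graph.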

2. Dynamic interface: dynamic_rnn

Mainly uses tf.nn.dynamic_rnn

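# x has shape [batch_size, n_steps, n_input]; no unstack is needed here
x = tf.placeholder("float", [None, n_steps, n_input])
lstm_cell = tf.contrib.rnn.BasicLSTMCell(n_hidden, forget_bias=1.0)
# outputs has shape [batch_size, n_steps, n_hidden]
outputs, _ = tf.nn.dynamic_rnn(lstm_cell, x, dtype=tf.float32)
# Switch to [n_steps, batch_size, n_hidden] so that outputs[-1] is the last time step
outputs = tf.transpose(outputs, [1, 0, 2])
pred = tf.contrib.layers.fully_connected(outputs[-1], n_classes, activation_fn=None)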
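
The transpose in this snippet exists only so that outputs[-1] selects the last time step instead of the last sample of the batch. A minimal plain-Python sketch of the same indexing, with nested lists standing in for tensors (to_time_major is a hypothetical helper, not a TensorFlow function):

```python
def to_time_major(outputs):
    """[batch_size, n_steps, n_hidden] -> [n_steps, batch_size, n_hidden]."""
    n_steps = len(outputs[0])
    return [[sample[t] for sample in outputs] for t in range(n_steps)]


# batch_size=2, n_steps=3, n_hidden=2
outputs = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    [[1.1, 1.2], [1.3, 1.4], [1.5, 1.6]],
]

last_via_transpose = to_time_major(outputs)[-1]
last_via_indexing = [sample[-1] for sample in outputs]

print(last_via_transpose)
print(last_via_transpose == last_via_indexing)
```

Both routes pick the final step of every sample, so in TensorFlow the transpose could presumably be replaced by slicing the batch-major tensor as outputs[:, -1, :].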

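When tf.nn.dynamic_rnn is executed, it uses a loop (tf.while_loop) to build the graph dynamically. This means:

Advantages:

Graph creation is faster and uses less memory;

Batches of variable size (different sequence lengths) can be fed in.

Disadvantages:

Only the final state is available in the model.

In other words, a dynamic RNN creates the RNN for just one step of the sequence, and the remaining steps are pushed through that same RNN by the loop.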
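
As a rough illustration of the loop-based approach, the plain-Python sketch below reuses one step function for every time step. rnn_step and run_dynamic are hypothetical names and the scalar recurrence is a stand-in for a real cell; the actual implementation relies on tf.while_loop inside the graph rather than a Python for loop.

```python
def rnn_step(state, x_t):
    """One shared cell application; a toy recurrence stands in for an LSTM cell."""
    return 0.5 * state + x_t


def run_dynamic(sequence):
    """Apply the same step in a loop, so any sequence length works."""
    state = 0.0
    outputs = []
    for x_t in sequence:
        state = rnn_step(state, x_t)  # the previous state is overwritten
        outputs.append(state)
    return outputs, state  # per-step outputs plus only the final state


for seq in ([1.0] * 3, [1.0] * 5):
    outputs, final_state = run_dynamic(seq)
    print(len(outputs), final_state)
```

Because the step is defined once, the amount of "graph" does not depend on the sequence length, which matches the faster build and lower memory use listed above.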

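Differences:

1. Inputs and outputs differ:

dynamic_rnn allows the batches fed in at different iterations to have different sequence lengths, although within a single batch all sequences must still have the same length (this requires declaring the time dimension of the placeholder as None). For example, the batch in the first iteration may have shape=[batch_size, 10], the second shape=[batch_size, 12], the third shape=[batch_size, 8], and so on.

static_rnn cannot do this: the batch shape [batch_size, max_seq] must stay the same in every iteration.

2. Training differs:

See reference 1 for details.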
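
A small plain-Python sketch of what this means for the data pipeline, assuming variable-length sequences are zero-padded before being fed. pad_batch and max_seq are hypothetical names chosen for the illustration. Padding each batch to its own longest sequence gives the per-iteration shapes that dynamic_rnn accepts, while padding everything to one global length gives the fixed shape that static_rnn needs.

```python
def pad_batch(sequences, length=None, pad_value=0):
    """Pad every sequence to `length`; default is the longest one in this batch."""
    if length is None:
        length = max(len(s) for s in sequences)
    return [s + [pad_value] * (length - len(s)) for s in sequences]


# Three iterations whose longest sequences have 10, 12 and 8 steps.
batches = [
    [[1] * 10, [1] * 7],
    [[1] * 12, [1] * 9],
    [[1] * 8, [1] * 8],
]
max_seq = 12  # hypothetical global maximum a static graph would be built for

# (batch_size, padded sequence length) per iteration
dynamic_shapes = [(len(b), len(pad_batch(b)[0])) for b in batches]
static_shapes = [(len(b), len(pad_batch(b, max_seq)[0])) for b in batches]

print(dynamic_shapes)
print(static_shapes)
```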

Code comparison for a multi-layer LSTM:

1. Static multi-layer RNN

import tensorflow as tf

# Load the MNIST dataset
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("c:/user/administrator/data/", one_hot=True)

n_input = 28     # MNIST input per time step (image shape: 28*28)
n_steps = 28     # time steps
n_hidden = 128   # number of features in the hidden layer
n_classes = 10   # MNIST classes (digits 0-9, 10 classes in total)
batch_size = 128

tf.reset_default_graph()

# tf Graph input
x = tf.placeholder("float", [None, n_steps, n_input])
y = tf.placeholder("float", [None, n_classes])

# Two-layer stack: an LSTM layer followed by a GRU layer
gru = tf.contrib.rnn.GRUCell(n_hidden*2)
lstm_cell = tf.contrib.rnn.LSTMCell(n_hidden)
mcell = tf.contrib.rnn.MultiRNNCell([lstm_cell, gru])

# static_rnn takes a list of n_steps tensors of shape [batch_size, n_input]
x1 = tf.unstack(x, n_steps, 1)
outputs, states = tf.contrib.rnn.static_rnn(mcell, x1, dtype=tf.float32)
pred = tf.contrib.layers.fully_connected(outputs[-1], n_classes, activation_fn=None)

learning_rate = 0.001

# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=pred, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Evaluate model
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

training_iters = 100000
display_step = 10

# Start the session
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    step = 1
    # Keep training until reach max iterations
    while step * batch_size < training_iters:
        batch_x, batch_y = mnist.train.next_batch(batch_size)
        # Reshape data to get 28 seq of 28 elements
        batch_x = batch_x.reshape((batch_size, n_steps, n_input))
        # Run optimization op (backprop)
        sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
        if step % display_step == 0:
            # Calculate batch accuracy
            acc = sess.run(accuracy, feed_dict={x: batch_x, y: batch_y})
            # Calculate batch loss
            loss = sess.run(cost, feed_dict={x: batch_x, y: batch_y})
            print("Iter " + str(step*batch_size) + ", Minibatch Loss= " +
                  "{:.6f}".format(loss) + ", Training Accuracy= " +
                  "{:.5f}".format(acc))
        step += 1
    print(" Finished!")

    # Calculate accuracy on the first 100 MNIST test images
    test_len = 100
    test_data = mnist.test.images[:test_len].reshape((-1, n_steps, n_input))
    test_label = mnist.test.labels[:test_len]
    print("Testing Accuracy:",
          sess.run(accuracy, feed_dict={x: test_data, y: test_label}))
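
In the script above, MultiRNNCell stacks the two cells so that the output of the LSTM layer becomes the input of the GRU layer, and the output of the whole stack comes from the last cell (size n_hidden*2 = 256 here). The plain-Python sketch below mimics only that wiring. ToyCell and ToyMultiCell are hypothetical stand-ins with scalar states, not the TensorFlow classes.

```python
class ToyCell:
    """Scalar stand-in for an RNN cell: new_state = decay * state + input."""

    def __init__(self, decay):
        self.decay = decay

    def __call__(self, x_t, state):
        new_state = self.decay * state + x_t
        return new_state, new_state  # (output, new_state)


class ToyMultiCell:
    """Chains cells: the output of layer i is the input of layer i + 1."""

    def __init__(self, cells):
        self.cells = cells

    def __call__(self, x_t, states):
        new_states = []
        for cell, state in zip(self.cells, states):
            x_t, new_state = cell(x_t, state)
            new_states.append(new_state)
        return x_t, tuple(new_states)  # top-layer output, one state per layer


mcell = ToyMultiCell([ToyCell(0.5), ToyCell(0.25)])
states = (0.0, 0.0)
for x_t in [1.0, 1.0]:
    output, states = mcell(x_t, states)
print(output, states)
```

As in the real stack, the returned state is a tuple with one entry per layer, while the output is whatever the top layer produced.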

2. Dynamic multi-layer RNN

import tensorflow as tf

# Load the MNIST dataset
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("c:/user/administrator/data/", one_hot=True)

n_input = 28     # MNIST input per time step (image shape: 28*28)
n_steps = 28     # time steps
n_hidden = 128   # number of features in the hidden layer
n_classes = 10   # MNIST classes (digits 0-9, 10 classes in total)
batch_size = 128

tf.reset_default_graph()

# tf Graph input
x = tf.placeholder("float", [None, n_steps, n_input])
y = tf.placeholder("float", [None, n_classes])

# Two-layer stack: an LSTM layer followed by a GRU layer
gru = tf.contrib.rnn.GRUCell(n_hidden*2)
lstm_cell = tf.contrib.rnn.LSTMCell(n_hidden)
mcell = tf.contrib.rnn.MultiRNNCell([lstm_cell, gru])

# dynamic_rnn takes the 3-D tensor directly; outputs has shape (?, 28, 256)
outputs, states = tf.nn.dynamic_rnn(mcell, x, dtype=tf.float32)
# (28, ?, 256): 28 time steps; outputs[-1] is the last time step, shape (?, 256)
outputs = tf.transpose(outputs, [1, 0, 2])
pred = tf.contrib.layers.fully_connected(outputs[-1], n_classes, activation_fn=None)

learning_rate = 0.001

# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=pred, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Evaluate model
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

training_iters = 100000
display_step = 10

# Start the session
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    step = 1
    # Keep training until reach max iterations
    while step * batch_size < training_iters:
        batch_x, batch_y = mnist.train.next_batch(batch_size)
        # Reshape data to get 28 seq of 28 elements
        batch_x = batch_x.reshape((batch_size, n_steps, n_input))
        # Run optimization op (backprop)
        sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
        if step % display_step == 0:
            # Calculate batch accuracy
            acc = sess.run(accuracy, feed_dict={x: batch_x, y: batch_y})
            # Calculate batch loss
            loss = sess.run(cost, feed_dict={x: batch_x, y: batch_y})
            print("Iter " + str(step*batch_size) + ", Minibatch Loss= " +
                  "{:.6f}".format(loss) + ", Training Accuracy= " +
                  "{:.5f}".format(acc))
        step += 1
    print(" Finished!")

    # Calculate accuracy on the first 100 MNIST test images
    test_len = 100
    test_data = mnist.test.images[:test_len].reshape((-1, n_steps, n_input))
    test_label = mnist.test.labels[:test_len]
    print("Testing Accuracy:",
          sess.run(accuracy, feed_dict={x: test_data, y: test_label}))
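
Both scripts share the same training loop, so it may help to see how many updates the condition step * batch_size < training_iters actually performs. The sketch below replays only the loop counters with the values used above; n_updates and n_logs are names introduced for this illustration.

```python
training_iters = 100000
batch_size = 128
display_step = 10

step = 1
n_updates = 0  # how many times the optimizer would run
n_logs = 0     # how many times loss and accuracy would be printed
while step * batch_size < training_iters:
    n_updates += 1
    if step % display_step == 0:
        n_logs += 1
    step += 1

print(n_updates, n_updates * batch_size, n_logs)
```

With these settings, training_iters counts samples seen rather than optimizer steps.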

References:

1. https://www.jianshu.com/p/1b1ea45fab47

2. What's the difference between tensorflow dynamic_rnn and rnn?
