TensorFlow -- BeamSearch

GitHub: https://github.com/zle1992/Seq2Seq-Chatbot

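1. In the inference stage, the variable scopes must be opened with reuse=True so that the decoder reuses the variables created for training.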

2. If you are using the BeamSearchDecoder with a cell wrapped in AttentionWrapper, then you must ensure that:

The encoder output has been tiled to beam_width via tf.contrib.seq2seq.tile_batch (NOT tf.tile).
The batch_size argument passed to the zero_state method of this wrapper is equal to true_batch_size * beam_width.
The initial state created with zero_state above contains a cell_state value containing properly tiled final state from the encoder.
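Why tile_batch and not tf.tile: the beam search decoder treats rows i * beam_width to (i + 1) * beam_width - 1 of the tiled tensor as the beams of example i, so every entry has to be repeated consecutively. The sketch below only imitates the two orderings on plain Python lists; the helper names tile_batch_like and tf_tile_like are made up for illustration and are not TensorFlow functions.

```python
def tile_batch_like(batch, multiplier):
    """Repeat each entry consecutively: t[0], t[0], ..., t[1], t[1], ..."""
    return [entry for entry in batch for _ in range(multiplier)]


def tf_tile_like(batch, multiplier):
    """Repeat the whole batch: t[0], t[1], ..., t[0], t[1], ..."""
    return [entry for _ in range(multiplier) for entry in batch]


batch = ["x0", "x1"]  # two source sentences
beam_width = 3

print(tile_batch_like(batch, beam_width))  # ['x0', 'x0', 'x0', 'x1', 'x1', 'x1']
print(tf_tile_like(batch, beam_width))     # ['x0', 'x1', 'x0', 'x1', 'x0', 'x1']

# Grouping consecutive rows into [batch, beam] only works for the first ordering.
tiled = tile_batch_like(batch, beam_width)
grouped = [tiled[i * beam_width:(i + 1) * beam_width] for i in range(len(batch))]
print(grouped)  # [['x0', 'x0', 'x0'], ['x1', 'x1', 'x1']]
```

With the tf.tile ordering, the same grouping would mix the encoder outputs of different examples inside one beam group.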

 import tensorflow as tf
 from tensorflow.python.layers.core import Dense

 BEAM_WIDTH = 5
 BATCH_SIZE = 128

 # INPUTS
 # Token-id matrices of shape [batch, time] and the true length of each sequence.
 X = tf.placeholder(tf.int32, [BATCH_SIZE, None])
 Y = tf.placeholder(tf.int32, [BATCH_SIZE, None])
 X_seq_len = tf.placeholder(tf.int32, [BATCH_SIZE])
 Y_seq_len = tf.placeholder(tf.int32, [BATCH_SIZE])

 # ENCODER
 # Source vocabulary of 10000 tokens, 128-dim embeddings, 128-unit LSTM.
 encoder_out, encoder_state = tf.nn.dynamic_rnn(
     cell = tf.nn.rnn_cell.BasicLSTMCell(128),
     inputs = tf.contrib.layers.embed_sequence(X, 10000, 128),
     sequence_length = X_seq_len,
     dtype = tf.float32)

 # DECODER COMPONENTS
 # Created once and shared by the training decoder and the beam search decoder.
 Y_vocab_size = 10000
 decoder_embedding = tf.Variable(tf.random_uniform([Y_vocab_size, 128], -1.0, 1.0))
 projection_layer = Dense(Y_vocab_size)

 # ATTENTION (TRAINING)
 # The memory here is the un-tiled encoder output (batch size BATCH_SIZE).
 with tf.variable_scope('shared_attention_mechanism'):
     attention_mechanism = tf.contrib.seq2seq.LuongAttention(
         num_units = 128,
         memory = encoder_out,
         memory_sequence_length = X_seq_len)
 decoder_cell = tf.contrib.seq2seq.AttentionWrapper(
     cell = tf.nn.rnn_cell.BasicLSTMCell(128),
     attention_mechanism = attention_mechanism,
     attention_layer_size = 128)

 # DECODER (TRAINING)
 # Teacher forcing: the ground-truth target tokens are fed as decoder inputs.
 training_helper = tf.contrib.seq2seq.TrainingHelper(
     inputs = tf.nn.embedding_lookup(decoder_embedding, Y),
     sequence_length = Y_seq_len,
     time_major = False)
 # Start from the attention wrapper's zero state, with the encoder's final state as cell_state.
 training_decoder = tf.contrib.seq2seq.BasicDecoder(
     cell = decoder_cell,
     helper = training_helper,
     initial_state = decoder_cell.zero_state(BATCH_SIZE,tf.float32).clone(cell_state=encoder_state),
     output_layer = projection_layer)
 with tf.variable_scope('decode_with_shared_attention'):
     training_decoder_output, _, _ = tf.contrib.seq2seq.dynamic_decode(
         decoder = training_decoder,
         impute_finished = True,
         maximum_iterations = tf.reduce_max(Y_seq_len))
 training_logits = training_decoder_output.rnn_output

 # BEAM SEARCH TILE
 # tile_batch repeats every batch entry BEAM_WIDTH times in place,
 # so the batch size becomes BATCH_SIZE * BEAM_WIDTH.
 encoder_out = tf.contrib.seq2seq.tile_batch(encoder_out, multiplier=BEAM_WIDTH)
 X_seq_len = tf.contrib.seq2seq.tile_batch(X_seq_len, multiplier=BEAM_WIDTH)
 encoder_state = tf.contrib.seq2seq.tile_batch(encoder_state, multiplier=BEAM_WIDTH)

 # ATTENTION (PREDICTING)
 # reuse=True shares the attention variables created for training;
 # the memory is now the tiled encoder output.
 with tf.variable_scope('shared_attention_mechanism', reuse=True):
     attention_mechanism = tf.contrib.seq2seq.LuongAttention(
         num_units = 128,
         memory = encoder_out,
         memory_sequence_length = X_seq_len)
 decoder_cell = tf.contrib.seq2seq.AttentionWrapper(
     cell = tf.nn.rnn_cell.BasicLSTMCell(128),
     attention_mechanism = attention_mechanism,
     attention_layer_size = 128)

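 # DECODER (PREDICTING)
 # start_tokens has one entry per original example (BATCH_SIZE, not tiled),
 # while zero_state is built for BATCH_SIZE * BEAM_WIDTH rows.
 # Token id 1 is used as the start token and id 2 as the end token.
 predicting_decoder = tf.contrib.seq2seq.BeamSearchDecoder(
     cell = decoder_cell,
     embedding = decoder_embedding,
     start_tokens = tf.tile(tf.constant([1], dtype=tf.int32), [BATCH_SIZE]),
     end_token = 2,
     initial_state = decoder_cell.zero_state(BATCH_SIZE * BEAM_WIDTH,tf.float32).clone(cell_state=encoder_state),
     beam_width = BEAM_WIDTH,
     output_layer = projection_layer,
     length_penalty_weight = 0.0)
 # reuse=True makes the beam search decoder use the weights of the training decoder.
 with tf.variable_scope('decode_with_shared_attention', reuse=True):
     predicting_decoder_output, _, _ = tf.contrib.seq2seq.dynamic_decode(
         decoder = predicting_decoder,
         impute_finished = False,
         maximum_iterations = 2 * tf.reduce_max(Y_seq_len))
 # predicted_ids has shape [batch, time, beam_width]; index 0 selects the first beam.
 # Despite the variable name, these are token ids, not logits.
 predicting_logits = predicting_decoder_output.predicted_ids[:, :, 0]
 print('successful')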
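For intuition about what the beam search decoder does, here is a minimal pure-Python sketch of the algorithm. It is a simplified illustration, not the tf.contrib.seq2seq implementation: toy_model, beam_search and length_penalty are hypothetical names, the next-token probabilities are invented, and the penalty follows the GNMT-style formula ((5 + length) / 6) ** weight as I understand it, which equals 1 when the weight is 0.0 as in the listing above.

```python
import math


def length_penalty(length, weight):
    """GNMT-style length penalty; returns 1.0 when weight is 0.0."""
    return ((5.0 + length) / 6.0) ** weight


def toy_model(prefix):
    """Invented next-token log-probabilities; token 2 is the end token."""
    if prefix == ():
        return {3: math.log(0.6), 4: math.log(0.4)}
    if prefix == (3,):
        return {2: math.log(0.55), 4: math.log(0.45)}
    if prefix == (4,):
        return {2: math.log(0.9), 3: math.log(0.1)}
    return {2: math.log(1.0)}  # any other prefix ends immediately


def beam_search(step_fn, beam_width, end_token, max_steps, weight=0.0):
    """Keep the beam_width best prefixes at each step; return hypotheses, best first."""
    beams = [((), 0.0)]  # (token prefix, cumulative log-probability)
    finished = []
    for _ in range(max_steps):
        # Expand every live prefix by every possible next token.
        candidates = [
            (prefix + (token,), score + log_prob)
            for prefix, score in beams
            for token, log_prob in step_fn(prefix).items()
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, score in candidates[:beam_width]:
            if prefix[-1] == end_token:
                finished.append((prefix, score))
            else:
                beams.append((prefix, score))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off by max_steps
    finished.sort(key=lambda c: c[1] / length_penalty(len(c[0]), weight), reverse=True)
    return finished


best = beam_search(toy_model, beam_width=2, end_token=2, max_steps=4)
greedy = beam_search(toy_model, beam_width=1, end_token=2, max_steps=4)
print(best[0][0])    # (4, 2)
print(greedy[0][0])  # (3, 2)
```

Greedy decoding (beam_width=1) commits to token 3 and ends with probability 0.33, while a beam of width 2 keeps token 4 alive and finds the sequence with probability 0.36. Taking best[0] here plays roughly the same role as selecting index 0 on the last axis of predicted_ids.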

References:

https://gist.github.com/higepon/eb81ba0f6663a57ff1908442ce753084

https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/BeamSearchDecoder

https://github.com/tensorflow/nmt#beam-search
