TensorFlow Tutorial: Generating Tang Poetry with an RNN

The dataset is the Complete Tang Poems (全唐诗). Download link: https://pan.baidu.com/s/13pNWfffr5HSN79WNb3Y0_w (extraction code: koss)

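Unlike traditional neural networks, whose inputs and outputs have a fixed size, RNNs let us work with sequences of vectors as both input and output; they were designed for modeling sequence data. The code in this post is ported from char-rnn, a Torch-based model written for English text that can be applied to Chinese with minor changes. char-rnn takes a text file as input, trains an RNN model on it, and then uses the model to generate text similar to the training data.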
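
As a rough illustration of what "sequences in, sequences out" means, the pure-Python sketch below runs the standard vanilla RNN recurrence h_t = tanh(W_xh x_t + W_hh h_{t-1}) over two sequences of different lengths using the same weights. The function name rnn_forward, the sizes, and the random weights are illustrative assumptions; this is neither char-rnn nor the LSTM used later in this post.

```python
import math
import random

def rnn_forward(inputs, w_xh, w_hh, hidden_size):
    """Apply h_t = tanh(W_xh x_t + W_hh h_{t-1}) to each step; return every hidden state."""
    h = [0.0] * hidden_size              # initial hidden state is all zeros
    states = []
    for x in inputs:                     # one iteration per time step
        new_h = []
        for j in range(hidden_size):
            total = sum(w_xh[j][i] * x[i] for i in range(len(x)))
            total += sum(w_hh[j][k] * h[k] for k in range(hidden_size))
            new_h.append(math.tanh(total))
        h = new_h                        # the state is carried to the next step
        states.append(h)
    return states

random.seed(0)
input_size, hidden_size = 3, 4
w_xh = [[random.uniform(-1, 1) for _ in range(input_size)] for _ in range(hidden_size)]
w_hh = [[random.uniform(-1, 1) for _ in range(hidden_size)] for _ in range(hidden_size)]

short_seq = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
long_seq = short_seq + [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
short_states = rnn_forward(short_seq, w_xh, w_hh, hidden_size)
long_states = rnn_forward(long_seq, w_xh, w_hh, hidden_size)
print(len(short_states), len(long_states))
```

The same weight matrices handle a sequence of any length, and the first two hidden states of the longer sequence match those of the shorter one, since each state depends only on the inputs seen so far.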

The code below has been modified to run on TensorFlow 1.4 with a GPU.

 #coding=utf-8

 import collections

 import numpy as np

 import tensorflow as tf

 import io

 import sys

 import os

 reload(sys)

 sys.setdefaultencoding('utf-8')

 #-------------------------------Data preprocessing---------------------------#

 poetry_file ='poetry.txt'

 # Collection of poems

 poetrys = []

 with io.open(poetry_file, "r", encoding='utf-8',) as f:

     for line in f:

         # print line

         try:

             title, content = line.strip().split(':')

             content = content.replace(' ','')

             if '_' in content or '(' in content or '（' in content or '《' in content or '[' in content:

                 continue

             if len(content) < 5 or len(content) > 79:

                 continue

             content = '[' + content + ']'

             poetrys.append(content)

         except Exception as e:

             pass

 # Sort the poems by length (number of characters)

 poetrys = sorted(poetrys,key=lambda line: len(line))

 print(u"Number of Tang poems: ")

 print(len(poetrys))

 print(u"test")

 # Count how many times each character occurs

 all_words = []

 for poetry in poetrys:

     all_words += [word for word in poetry]

 counter = collections.Counter(all_words)

 count_pairs = sorted(counter.items(), key=lambda x: -x[1])

 words, _ = zip(*count_pairs)

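 # Keep the N most frequent characters (here all of them), plus a space used for padding

 words = words[:len(words)] + (' ',)

 # Map each character to an integer ID

 word_num_map = dict(zip(words, range(len(words))))

 # Convert each poem to a vector of IDs (see TensorFlow exercise 1); unknown characters map to len(words)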

 to_num = lambda word: word_num_map.get(word, len(words))

 poetrys_vector = [ list(map(to_num, poetry)) for poetry in poetrys]

 #[[314, 3199, 367, 1556, 26, 179, 680, 0, 3199, 41, 506, 40, 151, 4, 98, 1],

 #[339, 3, 133, 31, 302, 653, 512, 0, 37, 148, 294, 25, 54, 833, 3, 1, 965, 1315, 377, 1700, 562, 21, 37, 0, 2, 1253, 21, 36, 264, 877, 809, 1]

 #....]

 # Train on 64 poems per batch

 batch_size = 64

 n_chunk = len(poetrys_vector) // batch_size

 x_batches = []

 y_batches = []

 for i in range(n_chunk):

     start_index = i * batch_size

     end_index = start_index + batch_size

     batches = poetrys_vector[start_index:end_index]

     length = max(map(len,batches))

     xdata = np.full((batch_size,length), word_num_map[' '], np.int32)

     for row in range(batch_size):

         xdata[row,:len(batches[row])] = batches[row]

     ydata = np.copy(xdata)

     ydata[:,:-1] = xdata[:,1:]

     """

     xdata             ydata

     [6,2,4,6,9]       [2,4,6,9,9]

     [1,4,2,8,5]       [4,2,8,5,5]

     """

     x_batches.append(xdata)

     y_batches.append(ydata)

 #---------------------------------------RNN--------------------------------------#

 input_data = tf.placeholder(tf.int32, [batch_size, None])

 output_targets = tf.placeholder(tf.int32, [batch_size, None])

 # Define the RNN

 def neural_network(model='lstm', rnn_size=128, num_layers=2):

     if model == 'rnn':

         cell_fun = tf.nn.rnn_cell.BasicRNNCell

     elif model == 'gru':

         cell_fun = tf.nn.rnn_cell.GRUCell

     elif model == 'lstm':

         cell_fun = tf.nn.rnn_cell.BasicLSTMCell

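     # Only BasicLSTMCell accepts state_is_tuple; BasicRNNCell and GRUCell do not

     cell = cell_fun(rnn_size, state_is_tuple=True) if model == 'lstm' else cell_fun(rnn_size)

     # Note: since TensorFlow 1.2, [cell] * num_layers makes all layers share one set of weights.

     # Training and generation must build the same stack, or the checkpoint will not load.

     cell = tf.nn.rnn_cell.MultiRNNCell([cell] * num_layers, state_is_tuple=True)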

     initial_state = cell.zero_state(batch_size, tf.float32)

     with tf.variable_scope('rnnlm'):

         softmax_w = tf.get_variable("softmax_w", [rnn_size, len(words)+1])

         softmax_b = tf.get_variable("softmax_b", [len(words)+1])

         with tf.device("/gpu:0"):

             embedding = tf.get_variable("embedding", [len(words)+1, rnn_size])

             inputs = tf.nn.embedding_lookup(embedding, input_data)

     outputs, last_state = tf.nn.dynamic_rnn(cell, inputs, initial_state=initial_state, scope='rnnlm')

     output = tf.reshape(outputs,[-1, rnn_size])

     logits = tf.matmul(output, softmax_w) + softmax_b

     probs = tf.nn.softmax(logits)

     return logits, last_state, probs, cell, initial_state

 ckpt_dir="./ckpt_dir"

 if not os.path.exists(ckpt_dir):

     os.makedirs(ckpt_dir)

 # Training

 def train_neural_network():

     logits, last_state, _, _, _ = neural_network()

     targets = tf.reshape(output_targets, [-1])

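     # The 4th positional argument is average_across_timesteps (default True), not a vocabulary size, so it is omitted

     loss = tf.contrib.legacy_seq2seq.sequence_loss_by_example([logits], [targets], [tf.ones_like(targets, dtype=tf.float32)])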

     cost = tf.reduce_mean(loss)

     learning_rate = tf.Variable(0.0, trainable=False)

     tvars = tf.trainable_variables()

     grads, _ = tf.clip_by_global_norm(tf.gradients(cost, tvars), 5)

     optimizer = tf.train.AdamOptimizer(learning_rate)

     train_op = optimizer.apply_gradients(zip(grads, tvars))

     with tf.Session() as sess:

         # initialize_all_variables / all_variables are deprecated aliases of the calls below

         sess.run(tf.global_variables_initializer())

         saver = tf.train.Saver(tf.global_variables())

         for epoch in range(295):

             sess.run(tf.assign(learning_rate, 0.002 * (0.97 ** epoch)))

             n = 0

             for batche in range(n_chunk):

                 train_loss, _ , _ = sess.run([cost, last_state, train_op], feed_dict={input_data: x_batches[n], output_targets: y_batches[n]})

                 n += 1

                 print(epoch, batche, train_loss)

             if epoch % 7 == 0:

                 saver.save(sess, ckpt_dir+'/poetry.module', global_step=epoch)

 train_neural_network()
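
The vocabulary and batching steps in the listing above can be traced on a toy example. The sketch below uses plain Python lists instead of NumPy and three made-up strings instead of real poems; the names chars, char_to_id and make_batch are illustrative. One deliberate difference: ties in frequency are broken by character so that the IDs are deterministic, whereas the listing sorts by frequency only.

```python
import collections

# Toy stand-ins for poems, already wrapped in '[' and ']' like the real data.
poems = ['[ab]', '[abca]', '[bc]']

# Count characters and sort by descending frequency (ties broken by character).
counter = collections.Counter(ch for poem in poems for ch in poem)
count_pairs = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
chars = [ch for ch, _ in count_pairs] + [' ']     # the trailing space is the padding symbol
char_to_id = dict(zip(chars, range(len(chars))))
pad_id = char_to_id[' ']

def make_batch(batch):
    """Pad every row to the longest poem, then build targets shifted left by one."""
    length = max(len(p) for p in batch)
    xdata = [[char_to_id[ch] for ch in p] + [pad_id] * (length - len(p)) for p in batch]
    # Same effect as ydata[:, :-1] = xdata[:, 1:] on a copy: the last column is repeated.
    ydata = [row[1:] + row[-1:] for row in xdata]
    return xdata, ydata

xdata, ydata = make_batch(poems)
for x_row, y_row in zip(xdata, ydata):
    print(x_row, y_row)
```

Each target row is the input row shifted one position to the left, so at every time step the network is asked to predict the next character.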

Here I only share some thoughts on debugging and tuning. For a detailed walkthrough of the code, please contact the original author.

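First, the #coding=utf-8 line. It tells the Python interpreter that this script's source text is encoded in UTF-8. Without it, Python 2 assumes ASCII by default and will most likely report an encoding error on the Chinese characters in the source.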
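
To see how such a declaration is read, the sketch below uses tokenize.detect_encoding from the Python 3 standard library, which looks for the encoding comment in the first lines of a source file. The helper name declared_encoding and the byte strings are illustrative assumptions. Note that this runs on Python 3, where the fallback is UTF-8; the ASCII default described above applies to Python 2.

```python
import io
import tokenize

def declared_encoding(source_bytes):
    """Return the source encoding Python 3 would use for these bytes."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding

with_cookie = declared_encoding(b"#coding=gbk\nprint('x')\n")
without_cookie = declared_encoding(b"print('x')\n")
print(with_cookie, without_cookie)
```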

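Next is the UTF-8 encoding of the dataset. The file is opened with the utf-8 encoding option, but the Python environment's default string encoding was never set to UTF-8. As a result, handling the parsed content and title raises an error on every line, the bare except swallows it silently, and the processed dataset ends up with size 0. Setting the default encoding through sys (reload(sys) followed by sys.setdefaultencoding('utf-8')) solves this.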

Also, Python 2's built-in open function has no encoding parameter; that parameter belongs to io.open, so the call has to be changed accordingly.
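
Because the loading loop swallows every exception, an empty dataset gives no hint about what went wrong. One possible way to make the problem visible is to count the skipped lines, as in the sketch below. The function name load_poems and the temporary toy file are assumptions made for illustration; the filter characters and length limits are copied from the listing.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

def load_poems(path, min_len=5, max_len=79):
    """Parse 'title:content' lines; return (poems, skipped_line_count)."""
    poems, skipped = [], 0
    # io.open accepts encoding= on both Python 2 and Python 3
    with io.open(path, 'r', encoding='utf-8') as f:
        for line in f:
            parts = line.strip().split(u':')
            if len(parts) != 2:                  # no separator, or more than one
                skipped += 1
                continue
            content = parts[1].replace(u' ', u'')
            if any(ch in content for ch in u'_(（《['):   # same filter as the listing
                skipped += 1
                continue
            if len(content) < min_len or len(content) > max_len:
                skipped += 1
                continue
            poems.append(u'[' + content + u']')
    return poems, skipped

# Toy input written to a temporary file (not the real dataset).
sample = (u'静夜思:床前明月光，疑是地上霜。\n'
          u'a line without a separator\n'
          u'短:太短\n'
          u'注:床前明月光（注）疑是地上霜\n')
fd, tmp_path = tempfile.mkstemp(suffix='.txt')
os.close(fd)
with io.open(tmp_path, 'w', encoding='utf-8') as f:
    f.write(sample)

poems, skipped = load_poems(tmp_path)
os.remove(tmp_path)
print(len(poems), skipped)
```

If the skipped count equals the number of lines in the file, the cause is more likely an encoding or format problem than the content filters.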

Finally, some API usage has changed. For example, saver.save now requires the parent directory of the checkpoint path to exist.
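
It can also help to check which checkpoints the training loop actually leaves on disk, because the prediction script restores a specific file. The sketch below only does the arithmetic implied by the training listing (295 epochs, a save every 7 epochs, learning rate 0.002 * 0.97 ** epoch) and assumes tf.train.Saver's default of keeping the 5 most recent checkpoints; it does not call TensorFlow.

```python
num_epochs = 295        # range(295) in the training loop
save_every = 7          # the loop saves when epoch % 7 == 0
max_to_keep = 5         # default retention of tf.train.Saver

saved_steps = [e for e in range(num_epochs) if e % save_every == 0]
kept_steps = saved_steps[-max_to_keep:]
kept_prefixes = ['./ckpt_dir/poetry.module-%d' % s for s in kept_steps]

# Learning rate used in the final epoch, from 0.002 * (0.97 ** epoch)
final_lr = 0.002 * (0.97 ** (num_epochs - 1))

print(kept_prefixes)
print(final_lr)
```

Under these assumptions the last retained checkpoint is the one for step 294, which is the file the prediction code loads. The final learning rate is several orders of magnitude below the initial 0.002, so the late epochs probably change the weights very little.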

Next is the prediction code.

 #coding=utf-8

 import collections

 import numpy as np

 import tensorflow as tf

 import io

 import sys

 import os

 import pdb

 import time

 reload(sys)

 sys.setdefaultencoding('utf-8')

 #-------------------------------Data preprocessing---------------------------#

 poetry_file ='poetry.txt'

 # Collection of poems

 poetrys = []

 with io.open(poetry_file, "r", encoding='utf-8',) as f:

     for line in f:

         try:

             title, content = line.strip().split(':')

             content = content.replace(' ','')

             if '_' in content or '(' in content or '（' in content or '《' in content or '[' in content:

                 continue

             if len(content) < 5 or len(content) > 79:

                 continue

             content = '[' + content + ']'

             poetrys.append(content)

         except Exception as e:

             pass

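 # Sort the poems by length (number of characters)

 poetrys = sorted(poetrys,key=lambda line: len(line))

 # Format the count into the string; print(a, b) would print a tuple on Python 2

 print(u'Number of Tang poems: %d' % len(poetrys))

 # Count how many times each character occurs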

 all_words = []

 for poetry in poetrys:

     all_words += [word for word in poetry]

 counter = collections.Counter(all_words)

 count_pairs = sorted(counter.items(), key=lambda x: -x[1])

 words, _ = zip(*count_pairs)

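 # Keep the N most frequent characters (here all of them), plus a space used for padding

 words = words[:len(words)] + (' ',)

 # Map each character to an integer ID

 word_num_map = dict(zip(words, range(len(words))))

 # Convert each poem to a vector of IDs; unknown characters map to len(words)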

 to_num = lambda word: word_num_map.get(word, len(words))

 poetrys_vector = [ list(map(to_num, poetry)) for poetry in poetrys]

 #[[314, 3199, 367, 1556, 26, 179, 680, 0, 3199, 41, 506, 40, 151, 4, 98, 1],

 #[339, 3, 133, 31, 302, 653, 512, 0, 37, 148, 294, 25, 54, 833, 3, 1, 965, 1315, 377, 1700, 562, 21, 37, 0, 2, 1253, 21, 36, 264, 877, 809, 1]

 #....]

 batch_size = 1

 n_chunk = len(poetrys_vector) // batch_size

 x_batches = []

 y_batches = []

 for i in range(n_chunk):

     start_index = i * batch_size

     end_index = start_index + batch_size

     batches = poetrys_vector[start_index:end_index]

     length = max(map(len,batches))

     xdata = np.full((batch_size,length), word_num_map[' '], np.int32)

     for row in range(batch_size):

         xdata[row,:len(batches[row])] = batches[row]

     ydata = np.copy(xdata)

     ydata[:,:-1] = xdata[:,1:]

     """

     xdata             ydata

     [6,2,4,6,9]       [2,4,6,9,9]

     [1,4,2,8,5]       [4,2,8,5,5]

     """

     x_batches.append(xdata)

     y_batches.append(ydata)

 #---------------------------------------RNN--------------------------------------#

 input_data = tf.placeholder(tf.int32, [batch_size, None])

 output_targets = tf.placeholder(tf.int32, [batch_size, None])

 # Define the RNN

 def neural_network(model='lstm', rnn_size=128, num_layers=2):

     if model == 'rnn':

         cell_fun = tf.nn.rnn_cell.BasicRNNCell

     elif model == 'gru':

         cell_fun = tf.nn.rnn_cell.GRUCell

     elif model == 'lstm':

         cell_fun = tf.nn.rnn_cell.BasicLSTMCell

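     # Only BasicLSTMCell accepts state_is_tuple; BasicRNNCell and GRUCell do not

     cell = cell_fun(rnn_size, state_is_tuple=True) if model == 'lstm' else cell_fun(rnn_size)

     # Note: since TensorFlow 1.2, [cell] * num_layers makes all layers share one set of weights.

     # Training and generation must build the same stack, or the checkpoint will not load.

     cell = tf.nn.rnn_cell.MultiRNNCell([cell] * num_layers, state_is_tuple=True)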

     initial_state = cell.zero_state(batch_size, tf.float32)

     with tf.variable_scope('rnnlm'):

         softmax_w = tf.get_variable("softmax_w", [rnn_size, len(words)+1])

         softmax_b = tf.get_variable("softmax_b", [len(words)+1])

         with tf.device("/gpu:0"):

             embedding = tf.get_variable("embedding", [len(words)+1, rnn_size])

             inputs = tf.nn.embedding_lookup(embedding, input_data)

     outputs, last_state = tf.nn.dynamic_rnn(cell, inputs, initial_state=initial_state, scope='rnnlm')

     output = tf.reshape(outputs,[-1, rnn_size])

     logits = tf.matmul(output, softmax_w) + softmax_b

     probs = tf.nn.softmax(logits)

     return logits, last_state, probs, cell, initial_state

 #-------------------------------Poem generation---------------------------------#

 # Use the trained model

 def gen_poetry():

     def to_word(weights):

         t = np.cumsum(weights)

         s = np.sum(weights)

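         sample = int(np.searchsorted(t, np.random.rand(1)*s))

         # probs has len(words)+1 entries; clamp so the extra "unknown" slot cannot index past words

         return words[min(sample, len(words) - 1)]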

     _, last_state, probs, cell, initial_state = neural_network()

     with tf.Session() as sess:

         # initialize_all_variables / all_variables are deprecated aliases of the calls below

         sess.run(tf.global_variables_initializer())

         saver = tf.train.Saver(tf.global_variables())

         saver.restore(sess, './ckpt_dir/poetry.module-294')

         state_ = sess.run(cell.zero_state(1, tf.float32))

         x = np.array([list(map(word_num_map.get, '['))])

         [probs_, state_] = sess.run([probs, last_state], feed_dict={input_data: x, initial_state: state_})

         word = to_word(probs_)

         #word = words[np.argmax(probs_)]

         poem = ''

         while word != ']':

             poem += word

             x = np.zeros((1,1))

             x[0,0] = word_num_map[word]

             [probs_, state_] = sess.run([probs, last_state], feed_dict={input_data: x, initial_state: state_})

             word = to_word(probs_)

             #word = words[np.argmax(probs_)]

         return poem

 def gen_poetry_with_head(head):

     def to_word(weights):

         t = np.cumsum(weights)

         s = np.sum(weights)

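         sample = int(np.searchsorted(t, np.random.rand(1)*s))

         # probs has len(words)+1 entries; clamp so the extra "unknown" slot cannot index past words

         return words[min(sample, len(words) - 1)]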

     _, last_state, probs, cell, initial_state = neural_network()

     with tf.Session() as sess:

         # initialize_all_variables / all_variables are deprecated aliases of the calls below

         sess.run(tf.global_variables_initializer())

         saver = tf.train.Saver(tf.global_variables())

         saver.restore(sess, './ckpt_dir/poetry.module-294')

         state_ = sess.run(cell.zero_state(1, tf.float32))

         poem = ''

         i = 0

         # print head

         # pdb.set_trace()

         for word in head:

             while word != '，' and word != '。':

                 poem += word

                 # print poem

                 # print head

                 # print word

                 x = np.array([list(map(word_num_map.get, word))])

                 [probs_, state_] = sess.run([probs, last_state], feed_dict={input_data: x, initial_state: state_})

                 word = to_word(probs_)

                 time.sleep(1)

             if i % 2 == 0:

                 poem += '，'

             else:

                 poem += '。'

             i += 1

         return poem

 print(gen_poetry())

 # print(gen_poetry_with_head(u'一二三四'))
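
The to_word helper in the listing above picks the next character by sampling from the softmax output: it builds a cumulative sum of the weights, draws a uniform random number scaled by the total, and looks up where that number falls. The sketch below reproduces the idea with the standard library only (bisect_left plays the role of np.searchsorted with its default side). The name sample_index and the toy weights are illustrative. The index is clamped because the probability vector in the listing has one more entry than words.

```python
import bisect
import random

def sample_index(weights, vocab_size, rng=random):
    """Draw an index with probability proportional to weights, clamped to the vocabulary."""
    cumulative, total = [], 0.0
    for w in weights:
        total += w
        cumulative.append(total)
    r = rng.random() * total                  # uniform in [0, total)
    idx = bisect.bisect_left(cumulative, r)   # first position whose cumulative sum >= r
    return min(idx, vocab_size - 1)           # never index past the last real character

rng = random.Random(0)
weights = [0.1, 0.7, 0.2, 0.0]                # last slot mimics the extra "unknown" entry
draws = [sample_index(weights, 3, rng) for _ in range(1000)]
counts = [draws.count(i) for i in range(3)]
print(counts)
```

Sampling keeps some variety in the generated poems. The commented-out argmax line in the listing would instead always pick the most probable character.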

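The acrostic-poem code (gen_poetry_with_head) has usage problems and I do not recommend using it. It took me a long time to get it working. For now I list the original author's code as-is; next time I will cover the fixes and tuning for this part separately.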
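
One issue visible in the listing is that the inner loop of gen_poetry_with_head only stops when the model happens to emit a comma or a full stop, so a line has no length limit. The sketch below shows one possible control flow with a fixed line length. It is not the fixed version mentioned above: next_char is a hypothetical stand-in for a call into the trained model, and the toy sampler, line_len and max_tries are assumptions made so the example runs without TensorFlow.

```python
# -*- coding: utf-8 -*-
import random

def acrostic(head, next_char, line_len=5, max_tries=50):
    """Build one line per character of head; next_char(prev) stands in for the model."""
    lines = []
    for i, first in enumerate(head):
        line = first                     # every line starts with its head character
        prev = first
        while len(line) < line_len:
            ch = next_char(prev)
            tries = 0
            # Resample punctuation, brackets and padding; after max_tries accept the draw.
            while ch in u'，。[] ' and tries < max_tries:
                ch = next_char(prev)
                tries += 1
            line += ch
            prev = ch
        # Alternate comma and full stop, as in the original listing.
        lines.append(line + (u'，' if i % 2 == 0 else u'。'))
    return u''.join(lines)

# Hypothetical sampler: ignores its input and picks from a tiny alphabet.
toy_chars = u'山水风月花'
rng = random.Random(0)

def toy_next_char(prev):
    return rng.choice(toy_chars)

poem = acrostic(u'一二三四', toy_next_char)
print(poem)
```

In a real implementation next_char would feed the previous character and the current RNN state to the network, as gen_poetry does, and return the sampled character.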

Results:

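It looks somewhat like poetry, but on closer inspection there are still serious problems: the output is largely gibberish, and the model is far from well tuned.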
