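When to use an RNN

Recurrent neural networks (RNNs) are well suited to processing and predicting sequential (time-series) data.

Characteristics of an RNN

The hidden-layer nodes of an RNN are connected across time steps: the input to the hidden layer is the output vector of the input layer concatenated with the hidden-layer state vector from the previous time step.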

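Demo: an RNN whose recurrent cell is a single fully connected layer

Input layer dimension: x

Hidden layer dimension: h

Input size of each recurrent cell: x + h

Output size of each recurrent cell: h

The output of the recurrent cell is used in two ways:

  1. As part of the cell's input at the next time step
  2. As the input to another fully connected network that produces the output for the current time step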
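
The sizes and the two uses listed above can be sketched in plain NumPy. This is only an illustration: the function name rnn_step, the weight names, the tanh activation, and the concrete sizes are assumptions, not part of the original notes.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_cell, b_cell, W_out, b_out):
    """One time step of a recurrent cell built from a single fully connected layer."""
    # The cell input is the current input (size x) joined with the previous state (size h).
    cell_input = np.concatenate([x_t, h_prev])      # shape: (x + h,)
    # The fully connected layer maps x + h inputs to h outputs (tanh is an assumed choice).
    h_t = np.tanh(W_cell @ cell_input + b_cell)     # shape: (h,)
    # Use 2: a second fully connected layer turns the state into this step's output.
    y_t = W_out @ h_t + b_out
    # Use 1: h_t is returned so the caller can feed it into the next time step.
    return h_t, y_t

x_dim, h_dim, y_dim = 3, 4, 2                       # illustrative sizes
rng = np.random.default_rng(0)
W_cell = rng.normal(scale=0.1, size=(h_dim, x_dim + h_dim))
b_cell = np.zeros(h_dim)
W_out = rng.normal(scale=0.1, size=(y_dim, h_dim))
b_out = np.zeros(y_dim)

h = np.zeros(h_dim)                                 # initial state
sequence = rng.normal(size=(5, x_dim))              # 5 time steps
outputs = []
for x_t in sequence:
    h, y = rnn_step(x_t, h, W_cell, b_cell, W_out, b_out)
    outputs.append(y)
outputs = np.stack(outputs)                         # shape: (5, y_dim)
```

The same weights are reused at every time step; only the state h changes as the sequence is consumed.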

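Sequence length

In theory an RNN supports sequences of any length, but very long sequences lead to vanishing gradients during optimization, so a maximum length is usually fixed in advance. Sequences that exceed this length are truncated.

Paper: On the difficulty of training Recurrent Neural Networks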
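
A minimal sketch of fixing the sequence length, assuming sequences are lists of word indices. The helper name fit_to_max_length is hypothetical, and padding short sequences with 0 is an added assumption (the notes above only mention truncation).

```python
def fit_to_max_length(sequence, max_length, pad_value=0):
    """Truncate a sequence to max_length items and pad shorter ones with pad_value."""
    truncated = list(sequence[:max_length])
    return truncated + [pad_value] * (max_length - len(truncated))

print(fit_to_max_length([5, 8, 2], 5))               # [5, 8, 2, 0, 0]
print(fit_to_max_length([1, 2, 3, 4, 5, 6, 7], 5))   # [1, 2, 3, 4, 5]
```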

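Long short-term memory networks (the LSTM structure)

Paper: Long Short-Term Memory

Recurrent cell: a special network structure with an input gate, a forget gate, and an output gate

Forget gate: based on the current input, the previous state, and the previous output, decides which parts of the stored state should be forgotten

Input gate: based on the current input, the previous state, and the previous output, decides which parts will enter the state at the current time step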
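
The gates can be sketched as one LSTM step in NumPy. This sketch uses the common formulation in which each gate is computed from the current input and the previous output, and the previous state enters through the element-wise update; peephole variants also feed the previous state into the gates. The function name lstm_step, the packed weight layout, and the sizes are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step. W has shape (4 * n, x_dim + n) and b has shape (4 * n,)."""
    n = h_prev.shape[0]
    z = W @ np.concatenate([x_t, h_prev]) + b
    f = sigmoid(z[0:n])              # forget gate: how much of c_prev to keep
    i = sigmoid(z[n:2 * n])          # input gate: how much new content to write
    o = sigmoid(z[2 * n:3 * n])      # output gate: how much of the state to expose
    g = np.tanh(z[3 * n:4 * n])      # candidate content for the state
    c_t = f * c_prev + i * g         # new state
    h_t = o * np.tanh(c_t)           # new output
    return h_t, c_t

x_dim, n = 3, 4                      # illustrative sizes
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4 * n, x_dim + n))
b = np.zeros(4 * n)
h, c = np.zeros(n), np.zeros(n)
for x_t in rng.normal(size=(5, x_dim)):
    h, c = lstm_step(x_t, h, c, W, b)
```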

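RNN variants

  1. Bidirectional RNN
  2. Deep (multi-layer) RNN

Dropout in RNNs

Dropout is applied between the recurrent cells of different layers, but not between the recurrent cells of the same layer (that is, not along the time axis).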
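
A rough NumPy sketch of this rule for a two-layer RNN: the value passed up from layer 1 to layer 2 goes through dropout, while the state each layer passes to its own next time step does not. The helper names, the inverted-dropout scaling, and the sizes are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(v, keep_prob, training=True):
    """Inverted dropout: zero random entries and rescale the ones that are kept."""
    if not training or keep_prob >= 1.0:
        return v
    mask = rng.random(v.shape) < keep_prob
    return v * mask / keep_prob

def cell(x_t, h_prev, W, b):
    """Simple recurrent cell: one fully connected layer with tanh."""
    return np.tanh(W @ np.concatenate([x_t, h_prev]) + b)

x_dim, h_dim, steps, keep_prob = 3, 4, 6, 0.5
W1 = rng.normal(scale=0.1, size=(h_dim, x_dim + h_dim))
b1 = np.zeros(h_dim)
W2 = rng.normal(scale=0.1, size=(h_dim, h_dim + h_dim))
b2 = np.zeros(h_dim)

h1 = np.zeros(h_dim)
h2 = np.zeros(h_dim)
for x_t in rng.normal(size=(steps, x_dim)):
    # Same layer, next time step: h1 and h2 are carried over without dropout.
    h1 = cell(x_t, h1, W1, b1)
    # Layer 1 -> layer 2: dropout is applied only on this vertical connection.
    h2 = cell(dropout(h1, keep_prob), h2, W2, b2)
```

At evaluation time the same loop would be run with training=False (or keep_prob set to 1.0), which matches feeding a keep probability of 1.0 for the test set.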

Demo: SMS spam classification with an RNN in TensorFlow

import os
import re
import io
import requests
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from zipfile import ZipFile
from tensorflow.python.framework import ops
ops.reset_default_graph()


1. Start a graph session and set RNN parameters

sess = tf.Session()

epochs = 20 # Run 20 epochs; one epoch is a full pass over all batches of the training set.
batch_size = 250
max_sequence_length = 25
rnn_size = 10 # The RNN will be of size 10 units.
embedding_size = 50 # every word will be embedded in a trainable vector of size 50
min_word_frequency = 10 # We will only consider words that appear at least 10 times in our vocabulary
learning_rate = 0.0005
dropout_keep_prob = tf.placeholder(tf.float32)

2. Download or open data

Check whether the data has already been downloaded and, if so, read in the file. Otherwise, download the data and save it.

# Download or open data
data_dir = 'data'
data_file = 'text_data.txt'
if not os.path.exists(data_dir):
    os.makedirs(data_dir)

if not os.path.isfile(os.path.join(data_dir, data_file)):
    zip_url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/00228/smsspamcollection.zip'
    r = requests.get(zip_url)
    z = ZipFile(io.BytesIO(r.content))
    file = z.read('SMSSpamCollection')
    # Format data: drop non-ASCII characters, then split into rows
    text_data = file.decode()
    text_data = text_data.encode('ascii', errors='ignore')
    text_data = text_data.decode().split('\n')

    # Save data to text file
    with open(os.path.join(data_dir, data_file), 'w') as file_conn:
        for text in text_data:
            # Append "\n" to each row (str.format is a built-in string method)
            file_conn.write("{}\n".format(text))
else:
    # Open data from text file
    text_data = []
    with open(os.path.join(data_dir, data_file), 'r') as file_conn:
        for row in file_conn:
            text_data.append(row)
    # Drop the trailing blank row written when the file was saved
    text_data = text_data[:-1]

# Split each non-empty row into [label, text]
text_data = [x.split('\t') for x in text_data if len(x) >= 1]
[text_data_target, text_data_train] = [list(x) for x in zip(*text_data)]

3. Create a text cleaning function then clean the data

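def clean_text(text_string):
    # Remove punctuation, underscores, and digits.
    # \w matches any word character (including the underscore);
    # [^\s\w] matches any character that is neither whitespace nor a word character.
    text_string = re.sub(r'([^\s\w]|_|[0-9])+', '', text_string)
    # Collapse runs of whitespace into single spaces
    text_string = " ".join(text_string.split())
    text_string = text_string.lower()
    return text_string

# Clean texts
text_data_train = [clean_text(x) for x in text_data_train]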
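
To see what the cleaning steps do, here is a small standalone example that applies the same regular expression and normalization to one made-up message. The sample string is an assumption for illustration, not a row from the data set.

```python
import re

pattern = r'([^\s\w]|_|[0-9])+'
sample = "WINNER!! You_have won 1000 pounds :-)"

# Step 1: strip punctuation, underscores, and digits.
stripped = re.sub(pattern, '', sample)
# Step 2: collapse whitespace and lowercase.
normalized = " ".join(stripped.split()).lower()

print(normalized)   # winner youhave won pounds
```

Note that removing the underscore joins "You_have" into a single token, since the match is deleted rather than replaced with a space.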

4. Change texts into numeric vectors

This will convert a text to an appropriate list of indices
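
As a rough illustration of what "a list of indices" means, the following pure-Python sketch builds a vocabulary from words that occur at least min_frequency times and maps every other word to index 0. The helper names and the toy texts are hypothetical, and this is not the library routine used in the demo, only the general idea.

```python
from collections import Counter

def build_vocab(texts, min_frequency):
    """Map each sufficiently frequent word to a positive index; 0 is reserved for unknown words."""
    counts = Counter(word for text in texts for word in text.split())
    kept = sorted(word for word, count in counts.items() if count >= min_frequency)
    return {word: idx for idx, word in enumerate(kept, start=1)}

def text_to_indices(text, vocab):
    """Replace each word with its index, using 0 for words outside the vocabulary."""
    return [vocab.get(word, 0) for word in text.split()]

texts = ["free prize now", "call me now", "free call"]
vocab = build_vocab(texts, min_frequency=2)
print(vocab)                                      # {'call': 1, 'free': 2, 'now': 3}
print(text_to_indices("free prize now", vocab))   # [2, 0, 3]
```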


# NOTE: the listing as published skips the lines that define `vocab_processor`,
# `text_processed`, and `shuffled_ix`. The next few lines are a reconstruction:
# they assume the TF 1.x helper tf.contrib.learn.preprocessing.VocabularyProcessor
# and assume the labels are encoded as ham = 1, spam = 0.
vocab_processor = tf.contrib.learn.preprocessing.VocabularyProcessor(max_sequence_length, min_frequency=min_word_frequency)
text_processed = np.array(list(vocab_processor.fit_transform(text_data_train)))
text_data_target = np.array([1 if x == 'ham' else 0 for x in text_data_target])

# Shuffle the data
shuffled_ix = np.random.permutation(np.arange(len(text_data_target)))
x_shuffled = text_processed[shuffled_ix]
y_shuffled = text_data_target[shuffled_ix]

# Split train/test set
ix_cutoff = int(len(y_shuffled)*0.80)
x_train, x_test = x_shuffled[:ix_cutoff], x_shuffled[ix_cutoff:]
y_train, y_test = y_shuffled[:ix_cutoff], y_shuffled[ix_cutoff:]
vocab_size = len(vocab_processor.vocabulary_)
print("Vocabulary Size: {:d}".format(vocab_size))
print("80-20 Train Test split: {:d} -- {:d}".format(len(y_train), len(y_test)))

# Create placeholders
x_data = tf.placeholder(tf.int32, [None, max_sequence_length])
y_output = tf.placeholder(tf.int32, [None])

# Create embedding
embedding_mat = tf.Variable(tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0))
embedding_output = tf.nn.embedding_lookup(embedding_mat, x_data)
#embedding_output_expanded = tf.expand_dims(embedding_output, -1)

# Define the RNN cell
# TensorFlow >= 1.0 moved the RNN cells into tf.contrib; earlier versions were not tested.
if tf.__version__[0]>='1':
    cell = tf.contrib.rnn.BasicRNNCell(num_units = rnn_size)
else:
    cell = tf.nn.rnn_cell.BasicRNNCell(num_units = rnn_size)

output, state = tf.nn.dynamic_rnn(cell, embedding_output, dtype=tf.float32)
output = tf.nn.dropout(output, dropout_keep_prob)

# Get output of RNN sequence: reorder to [time, batch, units] and take the last time step
output = tf.transpose(output, [1, 0, 2])
last = tf.gather(output, int(output.get_shape()[0]) - 1)

weight = tf.Variable(tf.truncated_normal([rnn_size, 2], stddev=0.1))
bias = tf.Variable(tf.constant(0.1, shape=[2]))
logits_out = tf.matmul(last, weight) + bias

# Loss function
losses = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits_out, labels=y_output) # logits=float32, labels=int32
loss = tf.reduce_mean(losses)

accuracy = tf.reduce_mean(tf.cast(tf.equal(tf.argmax(logits_out, 1), tf.cast(y_output, tf.int64)), tf.float32))

optimizer = tf.train.RMSPropOptimizer(learning_rate)
train_step = optimizer.minimize(loss)

init = tf.global_variables_initializer()
sess.run(init)

train_loss = []
test_loss = []
train_accuracy = []
test_accuracy = []

# Start training
for epoch in range(epochs):

    # Shuffle training data
    shuffled_ix = np.random.permutation(np.arange(len(x_train)))
    x_train = x_train[shuffled_ix]
    y_train = y_train[shuffled_ix]
    # Number of batches per epoch; rounding up avoids an empty trailing batch
    num_batches = int(np.ceil(len(x_train) / batch_size))
    for i in range(num_batches):
        # Select train data
        min_ix = i * batch_size
        max_ix = np.min([len(x_train), ((i+1) * batch_size)])
        x_train_batch = x_train[min_ix:max_ix]
        y_train_batch = y_train[min_ix:max_ix]

        # Run train step
        train_dict = {x_data: x_train_batch, y_output: y_train_batch, dropout_keep_prob:0.5}
        sess.run(train_step, feed_dict=train_dict)

    # Run loss and accuracy for training (measured on the last batch of the epoch)
    temp_train_loss, temp_train_acc = sess.run([loss, accuracy], feed_dict=train_dict)
    train_loss.append(temp_train_loss)
    train_accuracy.append(temp_train_acc)

    # Run Eval Step
    test_dict = {x_data: x_test, y_output: y_test, dropout_keep_prob:1.0}
    temp_test_loss, temp_test_acc = sess.run([loss, accuracy], feed_dict=test_dict)
    test_loss.append(temp_test_loss)
    test_accuracy.append(temp_test_acc)
    print('Epoch: {}, Test Loss: {:.2}, Test Acc: {:.2}'.format(epoch+1, temp_test_loss, temp_test_acc))

# Plot loss over time
epoch_seq = np.arange(1, epochs+1)
plt.plot(epoch_seq, train_loss, 'k--', label='Train Set')
plt.plot(epoch_seq, test_loss, 'r-', label='Test Set')
plt.title('Softmax Loss')
plt.xlabel('Epochs')
plt.ylabel('Softmax Loss')
plt.legend(loc='upper left')
plt.show()

# Plot accuracy over time
plt.plot(epoch_seq, train_accuracy, 'k--', label='Train Set')
plt.plot(epoch_seq, test_accuracy, 'r-', label='Test Set')
plt.title('Test Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend(loc='upper left')
plt.show()

Vocabulary Size: 1124

80-20 Train Test split: 4459 -- 1115

