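cs224d Assignment, Problem Set 2 (Part 3): Implementing a Language Model with RNNLM to Predict the Next Word
Today's post covers the third part of cs224d problem set 2.
It turns out that coursework at universities abroad really is this hard; by comparison, I had better just keep quietly grinding away back home.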
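The mathematical formulation of RNN language modeling is as follows.
Given a sequence of consecutive words $x_{1},x_{2},\dots,x_{t}$, the word $x_{t+1}$ that follows them is modeled as:
$P(x_{t+1}=v_{j}\mid x_{t},\dots,x_{1})=\hat{y}^{(t)}_{j}$
where $v_{j}$ is a word in the vocabulary. We implement a recurrent neural network that uses feedback through the hidden layer to model the "history" $x_{1},x_{2},\dots,x_{t}$:
$e^{(t)}=x^{(t)}L$
$h^{(t)}=\mathrm{sigmoid}(h^{(t-1)}H+e^{(t)}I+b_{1})$
$\hat{y}^{(t)}=\mathrm{softmax}(h^{(t)}U+b_{2})$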
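$h^{(0)}=h_{0}\in R^{D_{h}}$ is the initialization vector of the hidden layer
$x^{(t)}L$ is the product of the one-hot row vector $x^{(t)}$ and the embedding matrix $L$
The one-hot row vector encodes the index of the word currently being processed
$L\in R^{|V|\times d}$ is the word embedding matrix
$I\in R^{d\times D_{h}}$ is the input word representation matrix
$H\in R^{D_{h}\times D_{h}}$ is the hidden transformation matrix
$U\in R^{D_{h}\times|V|}$ is the output word representation matrix
$b_{1}\in R^{D_{h}}$ and $b_{2}\in R^{|V|}$ are the biases
$d$ is the dimension of the word embeddings
$|V|$ is the size of the vocabulary
$D_{h}$ is the dimension of the hidden layer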
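To make the shapes above concrete, here is a minimal NumPy sketch of one forward step of the recurrence, using the row-vector convention of this post. The function name `rnn_step`, the toy sizes and the random initialization are assumptions made for illustration; they are not part of the assignment code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    z = z - np.max(z, axis=-1, keepdims=True)  # stabilise the exponentials
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

def rnn_step(word_idx, h_prev, L, I, H, U, b1, b2):
    """One time step: returns the new hidden state and the next-word distribution."""
    e = L[word_idx]                        # same result as (one-hot row vector) @ L
    h = sigmoid(h_prev @ H + e @ I + b1)   # hidden state, shape (Dh,)
    y_hat = softmax(h @ U + b2)            # distribution over the vocabulary, shape (V,)
    return h, y_hat

# Toy sizes, assumed for illustration: |V| = 7, d = 4, Dh = 5
rng = np.random.default_rng(0)
V, d, Dh = 7, 4, 5
L = rng.normal(scale=0.1, size=(V, d))
I = rng.normal(scale=0.1, size=(d, Dh))
H = rng.normal(scale=0.1, size=(Dh, Dh))
U = rng.normal(scale=0.1, size=(Dh, V))
b1 = np.zeros(Dh)
b2 = np.zeros(V)

h0 = np.zeros(Dh)                          # h^(0): initial hidden state
h1, y_hat = rnn_step(3, h0, L, I, H, U, b1, b2)
print(h1.shape, y_hat.shape, y_hat.sum())
```

Indexing a row of `L` is equivalent to multiplying by the one-hot vector, which is why the TensorFlow code can use an embedding lookup instead of a matrix product.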
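The output vector $\hat{y}^{(t)}\in R^{|V|}$ is a probability distribution over the entire vocabulary. We optimize the (unregularized) cross-entropy loss:
$J^{(t)}(\theta)=CE(y^{(t)},\hat{y}^{(t)})=-\sum_{i=1}^{|V|}y_{i}^{(t)}\log\hat{y}_{i}^{(t)}$
where $y^{(t)}$ is the one-hot vector of the target word.
We evaluate the language model with perplexity, defined as:
$PP^{(t)}(y^{(t)},\hat{y}^{(t)})=\frac{1}{\sum_{j=1}^{|V|}y_{j}^{(t)}\cdot\hat{y}_{j}^{(t)}}$
Because $y^{(t)}$ is one-hot, this equals $\exp(J^{(t)})$, which is why the code below reports np.exp(np.mean(total_loss)) as the perplexity.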
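The relationship between cross-entropy and perplexity can be checked with a few lines of NumPy. This is a small sketch, not the assignment's `calculate_perplexity` helper (which is not shown in this post); the function names and the toy distribution are assumptions.

```python
import numpy as np

def cross_entropy(y_hat, target_idx):
    """Cross-entropy for a one-hot target: minus the log-probability of the true word."""
    return -np.log(y_hat[target_idx])

def perplexity(losses):
    """Perplexity over several steps: exp of the mean cross-entropy."""
    return float(np.exp(np.mean(losses)))

V = 10
uniform = np.full(V, 1.0 / V)              # a model that guesses uniformly
losses = [cross_entropy(uniform, t) for t in (0, 3, 7)]
pp_uniform = perplexity(losses)
print(pp_uniform)                          # uniform guessing gives a perplexity equal to |V|

# For a single step, 1 / (probability of the true word) equals exp(cross-entropy)
y_hat = np.array([0.7, 0.2, 0.1])
single_step_pp = 1.0 / y_hat[1]
print(single_step_pp, np.exp(cross_entropy(y_hat, 1)))
```

A useful reading of the number: a perplexity of 200 means the model is, on average, as uncertain as if it were choosing uniformly among 200 words.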
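Gradients:
Write $J^{(t)}$ for the cross-entropy loss at step $t$, $y^{(t)}$ for the one-hot target and $e^{(t)}=x^{(t)}L$. Considering only the dependence through time step $t$, the gradients used when optimizing each variable of the model are:
$\delta_{2}^{(t)}=\hat{y}^{(t)}-y^{(t)}$
$\delta_{1}^{(t)}=(\delta_{2}^{(t)}U^{T})\circ h^{(t)}\circ(1-h^{(t)})$
$\frac{\partial J^{(t)}}{\partial U}=(h^{(t)})^{T}\delta_{2}^{(t)}$, $\frac{\partial J^{(t)}}{\partial b_{2}}=\delta_{2}^{(t)}$
$\frac{\partial J^{(t)}}{\partial I}\Big|_{(t)}=(e^{(t)})^{T}\delta_{1}^{(t)}$, $\frac{\partial J^{(t)}}{\partial H}\Big|_{(t)}=(h^{(t-1)})^{T}\delta_{1}^{(t)}$, $\frac{\partial J^{(t)}}{\partial b_{1}}\Big|_{(t)}=\delta_{1}^{(t)}$
$\frac{\partial J^{(t)}}{\partial L_{x^{(t)}}}=\delta_{1}^{(t)}I^{T}$
$\frac{\partial J^{(t)}}{\partial h^{(t-1)}}=\delta_{1}^{(t)}H^{T}$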
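As a sanity check on the single-step gradients, the sketch below computes them in NumPy for one time step (row-vector convention, sigmoid hidden layer, softmax output) and compares one entry against a central finite difference. The function names, toy sizes and random values are assumptions made for illustration, and the code does not backpropagate through earlier time steps.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def step_loss(params, e, h_prev, target):
    """Forward pass for one step; returns (loss, h, y_hat)."""
    I, H, U, b1, b2 = params
    h = sigmoid(h_prev @ H + e @ I + b1)
    y_hat = softmax(h @ U + b2)
    return -np.log(y_hat[target]), h, y_hat

def step_grads(params, e, h_prev, target):
    """Analytic gradients of the single-step cross-entropy loss."""
    I, H, U, b1, b2 = params
    _, h, y_hat = step_loss(params, e, h_prev, target)
    delta2 = y_hat.copy()
    delta2[target] -= 1.0                    # y_hat - y
    delta1 = (delta2 @ U.T) * h * (1.0 - h)  # back through the sigmoid
    return {
        "U": np.outer(h, delta2),
        "b2": delta2,
        "I": np.outer(e, delta1),
        "H": np.outer(h_prev, delta1),
        "b1": delta1,
        "e": delta1 @ I.T,                   # gradient for the embedding row in use
        "h_prev": delta1 @ H.T,
    }

rng = np.random.default_rng(1)
V, d, Dh = 6, 3, 4
shapes = [(d, Dh), (Dh, Dh), (Dh, V), (Dh,), (V,)]
params = [rng.normal(scale=0.5, size=s) for s in shapes]
e = rng.normal(size=d)
h_prev = rng.uniform(size=Dh)
target = 2
grads = step_grads(params, e, h_prev, target)

# Central-difference check on one entry of H (params[1] is modified in place)
eps = 1e-6
H = params[1]
H[1, 2] += eps
loss_plus = step_loss(params, e, h_prev, target)[0]
H[1, 2] -= 2 * eps
loss_minus = step_loss(params, e, h_prev, target)[0]
H[1, 2] += eps                               # restore the original value
numeric = (loss_plus - loss_minus) / (2 * eps)
print(abs(numeric - grads["H"][1, 2]))
```

The same check can be repeated for any other parameter entry; a large mismatch usually points to a transposed matrix or a missing sigmoid derivative.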
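Initialize the values of all the trainable parameters listed above.
Then, for each training word, compute the derivative with respect to each parameter according to the formulas above.
Then update the parameters with gradient descent.
Plug the updated parameters back into the model; if the loss falls below the preset threshold, stop iterating, otherwise continue.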
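The procedure just described (initialize, compute gradients, update, stop once the loss is below a threshold) can be sketched as a generic loop. The toy quadratic objective, the function names and the values of `lr` and `tol` are assumptions chosen for illustration; the TensorFlow listing later in this post uses the Adam optimizer and early stopping on validation perplexity instead.

```python
import numpy as np

def train(loss_and_grad, params, lr=0.5, tol=1e-3, max_iters=10000):
    """Plain gradient descent: keep updating until the loss drops below tol."""
    for it in range(max_iters):
        loss, grads = loss_and_grad(params)
        if loss < tol:                       # stopping rule from the text
            break
        for name in params:
            params[name] -= lr * grads[name] # gradient descent update
    return params, loss, it

# Toy objective (assumed): half the squared distance between w and a fixed target
target = np.array([1.0, -2.0, 0.5])

def quad(params):
    diff = params["w"] - target
    return 0.5 * float(diff @ diff), {"w": diff}

params, final_loss, iters = train(quad, {"w": np.zeros(3)})
print(iters, final_loss, params["w"])
```

In a real language model the loss rarely reaches a fixed threshold, so a cap such as `max_iters` (or a validation-based criterion) is needed to guarantee termination.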
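Figure: structure of the RNNLM.
Figure: structure of the second-layer RNN node.
Figure: dropout applied to the RNN variables to reduce overfitting (dropout structure of the first RNN layer).
Figure: structure of the first-layer RNN.
(Warning: a big wall of arcane code is coming up.)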
'''
Created on September 26, 2017
@author: weizhen
'''
import getpass
import sys
import time
import numpy as np
from copy import deepcopy
from utils import calculate_perplexity, get_ptb_dataset, Vocab
from utils import ptb_iterator, sample
import tensorflow as tf
from model import LanguageModel
from tensorflow.contrib.legacy_seq2seq.python.ops.seq2seq import sequence_loss


class Config(object):
    """Holds hyperparameters and data information."""
    batch_size = 64
    embed_size = 50
    hidden_size = 100
    num_steps = 10
    max_epochs = 16
    early_stopping = 2
    dropout = 0.9
    lr = 0.001


class RNNLM_Model(LanguageModel):
    def load_data(self, debug=False):
        """Load the vocabulary and the train/valid/test data."""
        self.vocab = Vocab()
        self.vocab.construct(get_ptb_dataset('train'))
        self.encoded_train = np.array([self.vocab.encode(word) for word in get_ptb_dataset('train')], dtype=np.int32)
        self.encoded_valid = np.array([self.vocab.encode(word) for word in get_ptb_dataset('valid')], dtype=np.int32)
        self.encoded_test = np.array([self.vocab.encode(word) for word in get_ptb_dataset('test')], dtype=np.int32)
        if debug:
            # Keep only a small slice of each split for quick debugging runs
            num_debug = 1024
            self.encoded_train = self.encoded_train[:num_debug]
            self.encoded_valid = self.encoded_valid[:num_debug]
            self.encoded_test = self.encoded_test[:num_debug]

    def add_placeholders(self):
        """Create placeholder variables that represent the input tensors.

        These placeholders are used as inputs by the rest of the model
        and are fed with data during training.

        input_placeholder: input placeholder, shape (None, num_steps), type tf.int32
        labels_placeholder: label placeholder, shape (None, num_steps), type tf.int32
        dropout_placeholder: dropout value placeholder (scalar), type tf.float32
        """
        self.input_placeholder = tf.placeholder(tf.int32, shape=[None, self.config.num_steps], name='Input')
        self.labels_placeholder = tf.placeholder(tf.int32, shape=[None, self.config.num_steps], name='Target')
        self.dropout_placeholder = tf.placeholder(tf.float32, name='Dropout')

    def add_embedding(self):
        """Add the word embedding layer.

        Hint: this layer should use input_placeholder to index into the embeddings.
        Hint: you may find tf.nn.embedding_lookup useful.
        Hint: you may find tf.split and tf.squeeze useful when building the input tensors.
        Hint: dimension of the variable you need to create:
            L: (len(self.vocab), embed_size)
        Returns:
            inputs: a list of length num_steps; each element is a tensor
                    of shape (batch_size, embed_size)

        tf.split(value, num_or_size_splits, axis)
            axis selects which dimension of the input tensor is split
            (0 means split along dimension 0);
            num_or_size_splits is the number of pieces
            (2 means the input tensor is cut into 2 parts);
            the pieces are returned as a list.
        tf.squeeze(input, axis=None, name=None)
            removes dimensions of size 1 from the tensor (all of them by default)
            example: t is a tensor of shape [1,2,1,3,1,1]
                shape(squeeze(t))==>[2,3]
                shape(squeeze(t,[2,4]))==>[1,2,3,1]
        tf.nn.embedding_lookup maps word indices to word vectors.
        """
        with tf.device('/cpu:0'):
            embedding = tf.get_variable('Embedding', [len(self.vocab), self.config.embed_size], trainable=True)
            inputs = tf.nn.embedding_lookup(embedding, self.input_placeholder)
            inputs = [tf.squeeze(x, [1]) for x in tf.split(inputs, self.config.num_steps, 1)]
        return inputs

    def add_projection(self, rnn_outputs):
        """Add the projection layer.

        The projection layer maps the hidden representation to scores (logits)
        over the whole vocabulary; the softmax is applied later.
        Hint: dimensions of the variables you need to create:
            U: (hidden_size, len(vocab))
            b_2: (len(vocab),)
        Args:
            rnn_outputs: a list of length num_steps; each element is a tensor
                         of shape (batch_size, hidden_size)
        Returns:
            outputs: a list of length num_steps; each element is a tensor
                     of shape (batch_size, len(vocab))
        """
        with tf.variable_scope('Projection'):
            U = tf.get_variable('Matrix', [self.config.hidden_size, len(self.vocab)])
            proj_b = tf.get_variable('Bias', [len(self.vocab)])
            outputs = [tf.matmul(o, U) + proj_b for o in rnn_outputs]
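        return outputs

    def add_loss_op(self, output):
        """Add the loss op to the graph.

        Hint: use sequence_loss (imported above from tf.contrib.legacy_seq2seq)
              to implement the sequence loss.
        Args:
            output: a tensor of shape (None, len(self.vocab))
        Returns:
            loss: a 0-d tensor (scalar)
        """
        all_ones = [tf.ones([self.config.batch_size * self.config.num_steps])]
        # The fourth positional argument of the legacy sequence_loss is
        # average_across_timesteps (default True), so the len(self.vocab) that was
        # passed here only acted as a truthy flag and is omitted.
        cross_entropy = sequence_loss([output], [tf.reshape(self.labels_placeholder, [-1])], all_ones)
        tf.add_to_collection('total_loss', cross_entropy)
        loss = tf.add_n(tf.get_collection('total_loss'))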
        return loss

    def add_training_op(self, loss):
        """Add the training op to the graph.

        Create an optimizer and apply its gradient updates to all trainable variables.
        Hint: use tf.train.AdamOptimizer for this model;
              calling optimizer.minimize() returns a train_op object.
        Args:
            loss: loss tensor, from the cross-entropy loss
        Returns:
            train_op: the op used for training
        """
        with tf.variable_scope("Optimizer") as scope:
            train_op = tf.train.AdamOptimizer(self.config.lr).minimize(loss)
        return train_op

    def __init__(self, config):
        self.config = config
        self.load_data(debug=False)
        self.add_placeholders()
        self.inputs = self.add_embedding()
        self.rnn_outputs = self.add_model(self.inputs)
        self.outputs = self.add_projection(self.rnn_outputs)

        # We want to check how well the next word is predicted.
        # We cast o to float64 because otherwise there are numerical issues:
        # sum(output of softmax) = 1.00000298179 rather than 1
        self.predictions = [tf.nn.softmax(tf.cast(o, 'float64')) for o in self.outputs]
        # Reshape the outputs to (-1, len(vocab))
        output = tf.reshape(tf.concat(self.outputs, 1), [-1, len(self.vocab)])
        self.calculate_loss = self.add_loss_op(output)
        self.train_step = self.add_training_op(self.calculate_loss)

    def add_model(self, inputs):
        """Create the RNN LM model.

        In the implementation below you need to implement the RNN LM equations.
        Hint: use a zero tensor of shape (batch_size, hidden_size) as the initial RNN state.
        Hint: store the last RNN output as the instance variable self.final_state.
        Hint: make sure to apply dropout to both the inputs and the outputs.
        Hint: use the variable scope 'RNN' to define the RNN variables.
        Hint: perform an explicit for-loop over the inputs.
              You can use scope.reuse_variables() to make sure the weights
              are the same at every time step.
              Do not call it on the first iteration, because no variables
              have been created yet.
        Hint: dimensions of the variables you need to create:
            H: (hidden_size, hidden_size)
            I: (embed_size, hidden_size)
            b_1: (hidden_size,)
        Args:
            inputs: a list of length num_steps; each element is a tensor
                    of shape (batch_size, embed_size)
        Returns:
            outputs: a list of length num_steps; each element is a tensor
                     of shape (batch_size, hidden_size)
        """
        with tf.variable_scope('InputDropout'):
            inputs = [tf.nn.dropout(x, self.dropout_placeholder) for x in inputs]

        with tf.variable_scope('RNN') as scope:
            self.initial_state = tf.zeros([self.config.batch_size, self.config.hidden_size])
            state = self.initial_state
            rnn_outputs = []
            for tstep, current_input in enumerate(inputs):
                if tstep > 0:
                    scope.reuse_variables()
                RNN_H = tf.get_variable('HMatrix', [self.config.hidden_size, self.config.hidden_size])
                RNN_I = tf.get_variable('IMatrix', [self.config.embed_size, self.config.hidden_size])
                RNN_b = tf.get_variable('B', [self.config.hidden_size])
                state = tf.nn.sigmoid(tf.matmul(state, RNN_H) + tf.matmul(current_input, RNN_I) + RNN_b)
                rnn_outputs.append(state)
            self.final_state = rnn_outputs[-1]

        with tf.variable_scope('RNNDropout'):
            rnn_outputs = [tf.nn.dropout(x, self.dropout_placeholder) for x in rnn_outputs]
        return rnn_outputs

    def run_epoch(self, session, data, train_op=None, verbose=10):
        config = self.config
        dp = config.dropout
        if not train_op:
            # Evaluation mode: no parameter update and no dropout
            train_op = tf.no_op()
            dp = 1
        total_steps = sum(1 for x in ptb_iterator(data, config.batch_size, config.num_steps))
        total_loss = []
        state = self.initial_state.eval()
        for step, (x, y) in enumerate(ptb_iterator(data, config.batch_size, config.num_steps)):
            # We need to pass in the initial state and fetch the final state
            # so that the RNN is given the proper history
            feed = {self.input_placeholder: x,
                    self.labels_placeholder: y,
                    self.initial_state: state,
                    self.dropout_placeholder: dp
                    }
            loss, state, _ = session.run([self.calculate_loss, self.final_state, train_op], feed_dict=feed)
            total_loss.append(loss)
            if verbose and step % verbose == 0:
                sys.stdout.write('\r{} / {} : pp = {} '.format(step, total_steps, np.exp(np.mean(total_loss))))
                sys.stdout.flush()
        if verbose:
            sys.stdout.write('\r')
        return np.exp(np.mean(total_loss))


def generate_text(session, model, config, starting_text='<eos>', stop_length=100, stop_tokens=None, temp=1.0):
    """Generate text from the model.

    Hint: create a feed dictionary and use sess.run() to execute the model.
          You will need to use model.initial_state as a key in feed_dict.
    Hint: fetch model.final_state and model.predictions[-1].
          model.final_state is set in add_model();
          model.predictions is set in __init__.
    Hint: store the results of running the model in state and y_pred.
    Args:
        session : tf.Session() object
        model : Object of type RNNLM Model
        config : A Config() object
        starting_text: Initial text passed to model
    Returns:
        output : List of generated words (decoded from the word indices)
    """
    state = model.initial_state.eval()
    # Imagine tokens as a batch size of one, length of len(tokens[0])
    tokens = [model.vocab.encode(word) for word in starting_text.split()]
    for i in range(stop_length):
        # Feed only the most recent token; the history is carried by state
        feed = {model.input_placeholder: [tokens[-1:]],
                model.initial_state: state,
                model.dropout_placeholder: 1}
        state, y_pred = session.run([model.final_state, model.predictions[-1]], feed_dict=feed)
        next_word_idx = sample(y_pred[0], temperature=temp)
        tokens.append(next_word_idx)
        if stop_tokens and model.vocab.decode(tokens[-1]) in stop_tokens:
            break
    output = [model.vocab.decode(word_idx) for word_idx in tokens]
    return output


def generate_sentence(session, model, config, *args, **kwargs):
    """Convenience wrapper for generating a sentence from the model."""
    return generate_text(session, model, config, *args, stop_tokens=['<eos>'], **kwargs)


def test_RNNLM():
    config = Config()
    gen_config = deepcopy(config)
    gen_config.batch_size = gen_config.num_steps = 1

    # Create the training model and the generation model
    with tf.variable_scope('RNNLM', reuse=None) as scope:
        model = RNNLM_Model(config)
        # This tells gen_model to reuse the same variables as the model above
        scope.reuse_variables()
        gen_model = RNNLM_Model(gen_config)

    init = tf.global_variables_initializer()
    saver = tf.train.Saver()

    with tf.Session() as session:
        best_val_pp = float('inf')
        best_val_epoch = 0
        session.run(init)
        for epoch in range(config.max_epochs):
            print('Epoch {0}'.format(epoch))
            start = time.time()

            train_pp = model.run_epoch(session,
                                       model.encoded_train,
                                       train_op=model.train_step)
            valid_pp = model.run_epoch(session, model.encoded_valid)
            print('Training perplexity: {0}'.format(train_pp))
            print('Validation perplexity:{0}'.format(valid_pp))
            if valid_pp < best_val_pp:
                best_val_pp = valid_pp
                best_val_epoch = epoch
                saver.save(session, './ptb_rnnlm.weights')
            if epoch - best_val_epoch > config.early_stopping:
                break
            print('Total time : {0}'.format(time.time() - start))

        # Restore the best checkpoint (same path as used by saver.save above)
        saver.restore(session, './ptb_rnnlm.weights')
        test_pp = model.run_epoch(session, model.encoded_test)
        print('=-=' * 5)
        print('Test perplexity: {0} '.format(test_pp))
        print('=-=' * 5)
        starting_text = 'in palo alto'
        while starting_text:
            print(' '.join(generate_sentence(session, gen_model, gen_config, starting_text=starting_text, temp=1.0)))
            # Read the next prompt; an empty line ends the loop
            # (with the prompt commented out, this loop never terminated)
            starting_text = input('> ')


if __name__ == "__main__":
    test_RNNLM()
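
The embedding step in `add_embedding` (lookup, then split along the time axis, then squeeze) is easier to follow with concrete shapes. The following NumPy sketch mirrors those three operations; it uses NumPy stand-ins rather than the TensorFlow calls themselves, and the tiny sizes and the integer-valued embedding matrix are assumptions for illustration.

```python
import numpy as np

batch_size, num_steps, vocab_size, embed_size = 2, 3, 5, 4
embedding = np.arange(vocab_size * embed_size, dtype=np.float32).reshape(vocab_size, embed_size)
word_ids = np.array([[0, 2, 4],
                     [1, 1, 3]])                   # shape (batch_size, num_steps)

# Like tf.nn.embedding_lookup: one embedding row per word id
looked_up = embedding[word_ids]                    # (batch_size, num_steps, embed_size)

# Like tf.split(inputs, num_steps, 1): one slice per time step
pieces = np.split(looked_up, num_steps, axis=1)    # each (batch_size, 1, embed_size)

# Like tf.squeeze(x, [1]): drop the singleton time axis
inputs = [np.squeeze(p, axis=1) for p in pieces]   # each (batch_size, embed_size)

print(len(inputs), inputs[0].shape)
```

The result is a Python list with one `(batch_size, embed_size)` array per time step, which is the form the explicit for-loop in `add_model` iterates over.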
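(Actually it isn't really that arcane: it's much simpler than calculus, and several hundred thousand times simpler than mathematical analysis.)
Below is the training log.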
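The `temp` argument of `generate_text` controls how adventurous the sampling is. The `sample` helper comes from the assignment's `utils` module and is not shown in this post, so the sketch below is only one common way to implement temperature sampling, under that assumption; the function names here are hypothetical.

```python
import numpy as np

def apply_temperature(probs, temperature=1.0):
    """Rescale a distribution: temperature < 1 sharpens it, temperature > 1 flattens it."""
    logits = np.log(np.asarray(probs, dtype=np.float64)) / temperature
    logits -= logits.max()                 # numerical stability
    scaled = np.exp(logits)
    return scaled / scaled.sum()

def sample_with_temperature(probs, temperature=1.0, rng=None):
    """Draw one index from the temperature-adjusted distribution."""
    rng = np.random.default_rng() if rng is None else rng
    p = apply_temperature(probs, temperature)
    return int(rng.choice(len(p), p=p))

probs = np.array([0.6, 0.3, 0.1])
sharp = apply_temperature(probs, 0.5)      # more mass on the most likely word
flat = apply_temperature(probs, 2.0)       # closer to uniform
idx = sample_with_temperature(probs, 1.0, np.random.default_rng(0))
print(sharp, flat, idx)
```

This sketch does not handle probabilities that are exactly zero (the logarithm would be minus infinity), which a more careful implementation would guard against.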
1380 / 1452 : pp = 266.20892333984375
1390 / 1452 : pp = 265.94439697265625
1400 / 1452 : pp = 265.66845703125
1410 / 1452 : pp = 265.5393981933594
1420 / 1452 : pp = 265.32489013671875
1430 / 1452 : pp = 265.2019348144531
1440 / 1452 : pp = 265.13720703125
1450 / 1452 : pp = 264.954833984375
0 / 115 : pp = 296.9217224121094
10 / 115 : pp = 282.02130126953125
20 / 115 : pp = 279.76800537109375
30 / 115 : pp = 276.4101257324219
40 / 115 : pp = 276.2939147949219
50 / 115 : pp = 270.73565673828125
60 / 115 : pp = 269.88134765625
70 / 115 : pp = 266.8675231933594
80 / 115 : pp = 263.6731872558594
90 / 115 : pp = 260.8569030761719
100 / 115 : pp = 256.3356628417969
110 / 115 : pp = 255.1026611328125
Training perplexity: 264.9092102050781
Validation perplexity:254.84902954101562
Total time : 41.65332388877869
Epoch 3
0 / 1452 : pp = 327.0847473144531
10 / 1452 : pp = 273.9620056152344
20 / 1452 : pp = 270.22943115234375
30 / 1452 : pp = 263.5213317871094
40 / 1452 : pp = 264.0644836425781
50 / 1452 : pp = 258.6029968261719
60 / 1452 : pp = 257.04290771484375
70 / 1452 : pp = 257.59161376953125
80 / 1452 : pp = 256.7600402832031
90 / 1452 : pp = 254.5120391845703
100 / 1452 : pp = 252.44725036621094
110 / 1452 : pp = 250.13954162597656
120 / 1452 : pp = 249.91647338867188
130 / 1452 : pp = 249.50460815429688
140 / 1452 : pp = 247.67440795898438
150 / 1452 : pp = 247.19090270996094
160 / 1452 : pp = 247.8919219970703
170 / 1452 : pp = 247.54322814941406
180 / 1452 : pp = 246.17623901367188
190 / 1452 : pp = 245.78330993652344
200 / 1452 : pp = 246.80552673339844
210 / 1452 : pp = 246.3059844970703
220 / 1452 : pp = 246.19021606445312
230 / 1452 : pp = 246.70140075683594
240 / 1452 : pp = 246.3099822998047
250 / 1452 : pp = 245.1745147705078
260 / 1452 : pp = 244.17384338378906
270 / 1452 : pp = 242.57363891601562
280 / 1452 : pp = 242.8500213623047
290 / 1452 : pp = 243.0492706298828
300 / 1452 : pp = 243.1466522216797
310 / 1452 : pp = 242.89044189453125
320 / 1452 : pp = 243.08045959472656
330 / 1452 : pp = 243.32235717773438
340 / 1452 : pp = 242.34715270996094
350 / 1452 : pp = 242.80972290039062
360 / 1452 : pp = 242.5345458984375
370 / 1452 : pp = 242.0083465576172
380 / 1452 : pp = 241.22708129882812
390 / 1452 : pp = 241.24398803710938
400 / 1452 : pp = 240.63473510742188
410 / 1452 : pp = 240.94094848632812
420 / 1452 : pp = 241.19717407226562
430 / 1452 : pp = 240.8896026611328
440 / 1452 : pp = 240.7772979736328
450 / 1452 : pp = 240.45913696289062
460 / 1452 : pp = 240.06674194335938
470 / 1452 : pp = 239.42198181152344
480 / 1452 : pp = 238.39271545410156
490 / 1452 : pp = 238.0517120361328
500 / 1452 : pp = 237.31752014160156
510 / 1452 : pp = 237.1197967529297
520 / 1452 : pp = 236.64865112304688
530 / 1452 : pp = 236.004638671875
540 / 1452 : pp = 235.192626953125
550 / 1452 : pp = 234.6700439453125
560 / 1452 : pp = 234.1914825439453
570 / 1452 : pp = 233.80899047851562
580 / 1452 : pp = 233.3753662109375
590 / 1452 : pp = 232.8699188232422
600 / 1452 : pp = 232.2629852294922
610 / 1452 : pp = 231.8668212890625
620 / 1452 : pp = 231.478515625
630 / 1452 : pp = 231.0444793701172
640 / 1452 : pp = 231.2737579345703
650 / 1452 : pp = 231.28114318847656
660 / 1452 : pp = 231.4324951171875
670 / 1452 : pp = 231.48513793945312
680 / 1452 : pp = 231.45932006835938
690 / 1452 : pp = 231.17738342285156
700 / 1452 : pp = 231.00570678710938
710 / 1452 : pp = 231.03810119628906
720 / 1452 : pp = 230.96131896972656
730 / 1452 : pp = 230.91110229492188
740 / 1452 : pp = 231.13539123535156
750 / 1452 : pp = 231.04393005371094
760 / 1452 : pp = 231.03489685058594
770 / 1452 : pp = 231.19744873046875
780 / 1452 : pp = 231.26625061035156
790 / 1452 : pp = 231.38714599609375
800 / 1452 : pp = 231.24441528320312
810 / 1452 : pp = 231.16824340820312
820 / 1452 : pp = 231.11831665039062
830 / 1452 : pp = 231.34886169433594
840 / 1452 : pp = 231.221923828125
850 / 1452 : pp = 231.2562255859375
860 / 1452 : pp = 231.26492309570312
870 / 1452 : pp = 231.1961212158203
880 / 1452 : pp = 231.30506896972656
890 / 1452 : pp = 231.24728393554688
900 / 1452 : pp = 231.15744018554688
910 / 1452 : pp = 231.20175170898438
920 / 1452 : pp = 231.25534057617188
930 / 1452 : pp = 231.09461975097656
940 / 1452 : pp = 231.12612915039062
950 / 1452 : pp = 231.0475616455078
960 / 1452 : pp = 230.86056518554688
970 / 1452 : pp = 230.80377197265625
980 / 1452 : pp = 230.4598846435547
990 / 1452 : pp = 230.24559020996094
1000 / 1452 : pp = 229.91030883789062
1010 / 1452 : pp = 229.9349822998047
1020 / 1452 : pp = 230.01470947265625
1030 / 1452 : pp = 229.8909149169922
1040 / 1452 : pp = 229.9403533935547
1050 / 1452 : pp = 229.84815979003906
1060 / 1452 : pp = 229.60377502441406
1070 / 1452 : pp = 229.74647521972656
1080 / 1452 : pp = 229.80410766601562
1090 / 1452 : pp = 229.78733825683594
1100 / 1452 : pp = 229.64549255371094
1110 / 1452 : pp = 229.26255798339844
1120 / 1452 : pp = 229.00262451171875
1130 / 1452 : pp = 228.6716766357422
1140 / 1452 : pp = 228.55067443847656
1150 / 1452 : pp = 228.61563110351562
1160 / 1452 : pp = 228.50958251953125
1170 / 1452 : pp = 228.3498992919922
1180 / 1452 : pp = 228.29786682128906
1190 / 1452 : pp = 228.33204650878906
1200 / 1452 : pp = 228.27369689941406
1210 / 1452 : pp = 228.11831665039062
1220 / 1452 : pp = 228.21775817871094
1230 / 1452 : pp = 228.3170166015625
1240 / 1452 : pp = 228.22134399414062
1250 / 1452 : pp = 228.3769073486328
1260 / 1452 : pp = 228.37527465820312
1270 / 1452 : pp = 228.33694458007812
1280 / 1452 : pp = 228.27108764648438
1290 / 1452 : pp = 228.1731414794922
1300 / 1452 : pp = 228.12200927734375
1310 / 1452 : pp = 228.10275268554688
1320 / 1452 : pp = 227.9289093017578
1330 / 1452 : pp = 227.77723693847656
1340 / 1452 : pp = 227.79623413085938
1350 / 1452 : pp = 227.7408447265625
1360 / 1452 : pp = 227.72586059570312
1370 / 1452 : pp = 227.49728393554688
1380 / 1452 : pp = 227.37940979003906
1390 / 1452 : pp = 227.20166015625
1400 / 1452 : pp = 227.018310546875
1410 / 1452 : pp = 226.95651245117188
1420 / 1452 : pp = 226.8065643310547
1430 / 1452 : pp = 226.7261199951172
1440 / 1452 : pp = 226.7193145751953
1450 / 1452 : pp = 226.61068725585938
0 / 115 : pp = 269.342041015625
10 / 115 : pp = 255.03016662597656
20 / 115 : pp = 253.8992919921875
30 / 115 : pp = 251.04025268554688
40 / 115 : pp = 250.51756286621094
50 / 115 : pp = 245.3595428466797
60 / 115 : pp = 244.4713897705078
70 / 115 : pp = 241.2674560546875
80 / 115 : pp = 238.3473663330078
90 / 115 : pp = 235.56423950195312
100 / 115 : pp = 231.2281036376953
110 / 115 : pp = 229.8423614501953
Training perplexity: 226.5760040283203
Validation perplexity:229.59939575195312
Total time : 42.202677726745605
Epoch 4
0 / 1452 : pp = 282.2423095703125
10 / 1452 : pp = 240.16258239746094
20 / 1452 : pp = 236.12203979492188
30 / 1452 : pp = 230.3953857421875
40 / 1452 : pp = 231.8789825439453
50 / 1452 : pp = 227.26612854003906
60 / 1452 : pp = 226.22061157226562
70 / 1452 : pp = 227.01885986328125
80 / 1452 : pp = 226.2459716796875
90 / 1452 : pp = 224.3211669921875
100 / 1452 : pp = 222.65615844726562
110 / 1452 : pp = 220.70326232910156
120 / 1452 : pp = 220.42288208007812
130 / 1452 : pp = 219.8100128173828
140 / 1452 : pp = 218.04432678222656
150 / 1452 : pp = 217.31639099121094
160 / 1452 : pp = 217.86349487304688
170 / 1452 : pp = 217.46597290039062
180 / 1452 : pp = 216.3349151611328
190 / 1452 : pp = 216.12240600585938
200 / 1452 : pp = 216.97842407226562
210 / 1452 : pp = 216.51014709472656
220 / 1452 : pp = 216.46751403808594
230 / 1452 : pp = 216.80126953125
240 / 1452 : pp = 216.45965576171875
250 / 1452 : pp = 215.5008544921875
260 / 1452 : pp = 214.62210083007812
270 / 1452 : pp = 213.29183959960938
280 / 1452 : pp = 213.5621337890625
290 / 1452 : pp = 213.80657958984375
300 / 1452 : pp = 213.8963165283203
310 / 1452 : pp = 213.60653686523438
320 / 1452 : pp = 213.85877990722656
330 / 1452 : pp = 214.07345581054688
340 / 1452 : pp = 213.25421142578125
350 / 1452 : pp = 213.68019104003906
360 / 1452 : pp = 213.41717529296875
370 / 1452 : pp = 213.04920959472656
380 / 1452 : pp = 212.39019775390625
390 / 1452 : pp = 212.4908905029297
400 / 1452 : pp = 212.01914978027344
410 / 1452 : pp = 212.36903381347656
420 / 1452 : pp = 212.6802520751953
430 / 1452 : pp = 212.42697143554688
440 / 1452 : pp = 212.42990112304688
450 / 1452 : pp = 212.14524841308594
460 / 1452 : pp = 211.7836151123047
470 / 1452 : pp = 211.17282104492188
480 / 1452 : pp = 210.27903747558594
490 / 1452 : pp = 209.95211791992188
500 / 1452 : pp = 209.28302001953125
510 / 1452 : pp = 209.1029815673828
520 / 1452 : pp = 208.73855590820312
530 / 1452 : pp = 208.19700622558594
540 / 1452 : pp = 207.4554443359375
550 / 1452 : pp = 207.0062255859375
560 / 1452 : pp = 206.59739685058594
570 / 1452 : pp = 206.27874755859375
580 / 1452 : pp = 205.87144470214844
590 / 1452 : pp = 205.43545532226562
600 / 1452 : pp = 204.90940856933594
610 / 1452 : pp = 204.5686798095703
620 / 1452 : pp = 204.22862243652344
630 / 1452 : pp = 203.8448028564453
640 / 1452 : pp = 204.06576538085938
650 / 1452 : pp = 204.0941925048828
660 / 1452 : pp = 204.22103881835938
670 / 1452 : pp = 204.289794921875
680 / 1452 : pp = 204.3115234375
690 / 1452 : pp = 204.10284423828125
700 / 1452 : pp = 203.99757385253906
710 / 1452 : pp = 204.04971313476562
720 / 1452 : pp = 204.03152465820312
730 / 1452 : pp = 203.99046325683594
740 / 1452 : pp = 204.19786071777344
750 / 1452 : pp = 204.1642608642578
760 / 1452 : pp = 204.19435119628906
770 / 1452 : pp = 204.37786865234375
780 / 1452 : pp = 204.4965057373047
790 / 1452 : pp = 204.6479034423828
800 / 1452 : pp = 204.56117248535156
810 / 1452 : pp = 204.52284240722656
820 / 1452 : pp = 204.50978088378906
830 / 1452 : pp = 204.7531280517578
840 / 1452 : pp = 204.64468383789062
850 / 1452 : pp = 204.71348571777344
860 / 1452 : pp = 204.7399444580078
870 / 1452 : pp = 204.69406127929688
880 / 1452 : pp = 204.7965850830078
890 / 1452 : pp = 204.7594757080078
900 / 1452 : pp = 204.71446228027344
910 / 1452 : pp = 204.7590789794922
920 / 1452 : pp = 204.85772705078125
930 / 1452 : pp = 204.7428741455078
940 / 1452 : pp = 204.8068389892578
950 / 1452 : pp = 204.75791931152344
960 / 1452 : pp = 204.63815307617188
970 / 1452 : pp = 204.60760498046875
980 / 1452 : pp = 204.34347534179688
990 / 1452 : pp = 204.151611328125
1000 / 1452 : pp = 203.8665771484375
1010 / 1452 : pp = 203.9164581298828
1020 / 1452 : pp = 204.0184783935547
1030 / 1452 : pp = 203.95166015625
1040 / 1452 : pp = 204.03045654296875
1050 / 1452 : pp = 203.95846557617188
1060 / 1452 : pp = 203.77114868164062
1070 / 1452 : pp = 203.93260192871094
1080 / 1452 : pp = 204.00048828125
1090 / 1452 : pp = 204.00233459472656
1100 / 1452 : pp = 203.8960418701172
1110 / 1452 : pp = 203.5987548828125
1120 / 1452 : pp = 203.38392639160156
1130 / 1452 : pp = 203.08872985839844
1140 / 1452 : pp = 203.01272583007812
1150 / 1452 : pp = 203.0865936279297
1160 / 1452 : pp = 203.02308654785156
1170 / 1452 : pp = 202.9125518798828
1180 / 1452 : pp = 202.9097442626953
1190 / 1452 : pp = 202.98252868652344
1200 / 1452 : pp = 202.95387268066406
1210 / 1452 : pp = 202.851318359375
1220 / 1452 : pp = 202.97671508789062
1230 / 1452 : pp = 203.1051025390625
1240 / 1452 : pp = 203.0526123046875
1250 / 1452 : pp = 203.21417236328125
1260 / 1452 : pp = 203.23617553710938
1270 / 1452 : pp = 203.22802734375
1280 / 1452 : pp = 203.20846557617188
1290 / 1452 : pp = 203.15362548828125
1300 / 1452 : pp = 203.14315795898438
1310 / 1452 : pp = 203.15264892578125
1320 / 1452 : pp = 203.02801513671875
1330 / 1452 : pp = 202.92977905273438
1340 / 1452 : pp = 202.95484924316406
1350 / 1452 : pp = 202.9335479736328
1360 / 1452 : pp = 202.955322265625
1370 / 1452 : pp = 202.7740478515625
1380 / 1452 : pp = 202.68569946289062
1390 / 1452 : pp = 202.55816650390625
1400 / 1452 : pp = 202.41651916503906
1410 / 1452 : pp = 202.38494873046875
1420 / 1452 : pp = 202.27593994140625
1430 / 1452 : pp = 202.21826171875
1440 / 1452 : pp = 202.23272705078125
1450 / 1452 : pp = 202.16099548339844
0 / 115 : pp = 253.23211669921875
10 / 115 : pp = 237.62506103515625
20 / 115 : pp = 237.60557556152344
30 / 115 : pp = 234.9273223876953
40 / 115 : pp = 234.30519104003906
50 / 115 : pp = 229.43960571289062
60 / 115 : pp = 228.6050567626953
70 / 115 : pp = 225.2646484375
80 / 115 : pp = 222.55935668945312
90 / 115 : pp = 219.83255004882812
100 / 115 : pp = 215.5491485595703
110 / 115 : pp = 214.07937622070312
Training perplexity: 202.1349639892578
Validation perplexity:213.85256958007812
Total time : 42.10724234580994
Epoch 5
0 / 1452 : pp = 255.92384338378906
10 / 1452 : pp = 219.5322265625
20 / 1452 : pp = 214.36212158203125
30 / 1452 : pp = 209.12620544433594
40 / 1452 : pp = 210.04193115234375
50 / 1452 : pp = 205.77398681640625
60 / 1452 : pp = 204.8201141357422
70 / 1452 : pp = 205.3955841064453
80 / 1452 : pp = 204.8386688232422
90 / 1452 : pp = 203.21194458007812
100 / 1452 : pp = 201.87643432617188
110 / 1452 : pp = 200.10122680664062
120 / 1452 : pp = 199.82012939453125
130 / 1452 : pp = 199.11192321777344
140 / 1452 : pp = 197.51919555664062
150 / 1452 : pp = 197.03567504882812
160 / 1452 : pp = 197.4231414794922
170 / 1452 : pp = 197.09571838378906
180 / 1452 : pp = 196.17665100097656
190 / 1452 : pp = 196.0064697265625
200 / 1452 : pp = 196.7347869873047
210 / 1452 : pp = 196.3063507080078
220 / 1452 : pp = 196.21388244628906
230 / 1452 : pp = 196.5252227783203
240 / 1452 : pp = 196.203125
250 / 1452 : pp = 195.3251953125
260 / 1452 : pp = 194.53335571289062
270 / 1452 : pp = 193.3546142578125
280 / 1452 : pp = 193.59420776367188
290 / 1452 : pp = 193.83297729492188
300 / 1452 : pp = 193.98489379882812
310 / 1452 : pp = 193.68414306640625
320 / 1452 : pp = 193.89065551757812
330 / 1452 : pp = 194.0518798828125
340 / 1452 : pp = 193.32888793945312
350 / 1452 : pp = 193.76219177246094
360 / 1452 : pp = 193.56106567382812
370 / 1452 : pp = 193.28179931640625
380 / 1452 : pp = 192.7037811279297
390 / 1452 : pp = 192.8145294189453
400 / 1452 : pp = 192.43325805664062
410 / 1452 : pp = 192.81527709960938
420 / 1452 : pp = 193.13760375976562
430 / 1452 : pp = 192.9148712158203
440 / 1452 : pp = 192.92526245117188
450 / 1452 : pp = 192.70083618164062
460 / 1452 : pp = 192.36647033691406
470 / 1452 : pp = 191.85394287109375
480 / 1452 : pp = 191.07244873046875
490 / 1452 : pp = 190.75401306152344
500 / 1452 : pp = 190.1843719482422
510 / 1452 : pp = 190.03334045410156
520 / 1452 : pp = 189.72938537597656
530 / 1452 : pp = 189.25889587402344
540 / 1452 : pp = 188.59315490722656
550 / 1452 : pp = 188.19313049316406
560 / 1452 : pp = 187.80621337890625
570 / 1452 : pp = 187.5229034423828
580 / 1452 : pp = 187.1091766357422
590 / 1452 : pp = 186.72592163085938
600 / 1452 : pp = 186.2238006591797
610 / 1452 : pp = 185.89695739746094
620 / 1452 : pp = 185.60989379882812
630 / 1452 : pp = 185.2689208984375
640 / 1452 : pp = 185.47567749023438
650 / 1452 : pp = 185.5127410888672
660 / 1452 : pp = 185.64627075195312
670 / 1452 : pp = 185.71311950683594
680 / 1452 : pp = 185.72569274902344
690 / 1452 : pp = 185.56459045410156
700 / 1452 : pp = 185.48681640625
710 / 1452 : pp = 185.5458221435547
720 / 1452 : pp = 185.5598907470703
730 / 1452 : pp = 185.5335235595703
740 / 1452 : pp = 185.73995971679688
750 / 1452 : pp = 185.744384765625
760 / 1452 : pp = 185.81268310546875
770 / 1452 : pp = 186.00088500976562
780 / 1452 : pp = 186.14443969726562
790 / 1452 : pp = 186.30764770507812
800 / 1452 : pp = 186.2595977783203
810 / 1452 : pp = 186.23028564453125
820 / 1452 : pp = 186.23997497558594
830 / 1452 : pp = 186.49057006835938
840 / 1452 : pp = 186.43331909179688
850 / 1452 : pp = 186.48887634277344
860 / 1452 : pp = 186.51502990722656
870 / 1452 : pp = 186.5167999267578
880 / 1452 : pp = 186.62400817871094
890 / 1452 : pp = 186.6103973388672
900 / 1452 : pp = 186.58111572265625
910 / 1452 : pp = 186.64126586914062
920 / 1452 : pp = 186.7366180419922
930 / 1452 : pp = 186.65719604492188
940 / 1452 : pp = 186.71755981445312
950 / 1452 : pp = 186.6977996826172
960 / 1452 : pp = 186.62774658203125
970 / 1452 : pp = 186.62115478515625
980 / 1452 : pp = 186.3773193359375
990 / 1452 : pp = 186.23109436035156
1000 / 1452 : pp = 185.99227905273438
1010 / 1452 : pp = 186.0488739013672
1020 / 1452 : pp = 186.1744384765625
1030 / 1452 : pp = 186.1162109375
1040 / 1452 : pp = 186.18899536132812
1050 / 1452 : pp = 186.1549072265625
1060 / 1452 : pp = 186.01419067382812
1070 / 1452 : pp = 186.17364501953125
1080 / 1452 : pp = 186.27061462402344
1090 / 1452 : pp = 186.28428649902344
1100 / 1452 : pp = 186.2150115966797
1110 / 1452 : pp = 185.95103454589844
1120 / 1452 : pp = 185.77423095703125
1130 / 1452 : pp = 185.5232696533203
1140 / 1452 : pp = 185.4607391357422
1150 / 1452 : pp = 185.56077575683594
1160 / 1452 : pp = 185.53343200683594
1170 / 1452 : pp = 185.46453857421875
1180 / 1452 : pp = 185.4741668701172
1190 / 1452 : pp = 185.5594482421875
1200 / 1452 : pp = 185.53785705566406
1210 / 1452 : pp = 185.4576416015625
1220 / 1452 : pp = 185.5943145751953
1230 / 1452 : pp = 185.7483673095703
1240 / 1452 : pp = 185.70762634277344
1250 / 1452 : pp = 185.8568115234375
1260 / 1452 : pp = 185.90635681152344
1270 / 1452 : pp = 185.8961639404297
1280 / 1452 : pp = 185.89199829101562
1290 / 1452 : pp = 185.85911560058594
1300 / 1452 : pp = 185.86097717285156
1310 / 1452 : pp = 185.88739013671875
1320 / 1452 : pp = 185.79248046875
1330 / 1452 : pp = 185.69700622558594
1340 / 1452 : pp = 185.7310028076172
1350 / 1452 : pp = 185.72613525390625
1360 / 1452 : pp = 185.76829528808594
1370 / 1452 : pp = 185.6322021484375
1380 / 1452 : pp = 185.56378173828125
1390 / 1452 : pp = 185.4654998779297
1400 / 1452 : pp = 185.35110473632812
1410 / 1452 : pp = 185.33917236328125
1420 / 1452 : pp = 185.2509002685547
1430 / 1452 : pp = 185.20436096191406
1440 / 1452 : pp = 185.2254638671875
1450 / 1452 : pp = 185.16542053222656
0 / 115 : pp = 242.26800537109375
10 / 115 : pp = 226.12258911132812
20 / 115 : pp = 226.4702606201172
30 / 115 : pp = 223.982666015625
40 / 115 : pp = 223.376953125
50 / 115 : pp = 218.65716552734375
60 / 115 : pp = 217.95306396484375
70 / 115 : pp = 214.5392303466797
80 / 115 : pp = 212.07525634765625
90 / 115 : pp = 209.40631103515625
100 / 115 : pp = 205.1455078125
110 / 115 : pp = 203.6289520263672
Training perplexity: 185.14476013183594
Validation perplexity:203.3822784423828
Total time : 42.47052240371704
Epoch 6
0 / 1452 : pp = 233.56707763671875
10 / 1452 : pp = 202.6468505859375
20 / 1452 : pp = 198.2734375
30 / 1452 : pp = 193.47442626953125
40 / 1452 : pp = 195.17147827148438
50 / 1452 : pp = 191.5596923828125
60 / 1452 : pp = 190.4825897216797
70 / 1452 : pp = 191.07681274414062
80 / 1452 : pp = 190.339599609375
90 / 1452 : pp = 188.98277282714844
100 / 1452 : pp = 187.74757385253906
110 / 1452 : pp = 186.10104370117188
120 / 1452 : pp = 185.7500457763672
130 / 1452 : pp = 184.90707397460938
140 / 1452 : pp = 183.340087890625
150 / 1452 : pp = 182.70840454101562
160 / 1452 : pp = 183.1043701171875
170 / 1452 : pp = 182.69776916503906
180 / 1452 : pp = 181.88400268554688
190 / 1452 : pp = 181.8062286376953
200 / 1452 : pp = 182.4969940185547
210 / 1452 : pp = 182.10572814941406
220 / 1452 : pp = 181.9981689453125
230 / 1452 : pp = 182.3802490234375
240 / 1452 : pp = 182.03636169433594
250 / 1452 : pp = 181.23712158203125
260 / 1452 : pp = 180.53726196289062
270 / 1452 : pp = 179.53567504882812
280 / 1452 : pp = 179.70208740234375
290 / 1452 : pp = 179.977783203125
300 / 1452 : pp = 180.16600036621094
310 / 1452 : pp = 179.87294006347656
320 / 1452 : pp = 180.11849975585938
330 / 1452 : pp = 180.31838989257812
340 / 1452 : pp = 179.56759643554688
350 / 1452 : pp = 179.97134399414062
360 / 1452 : pp = 179.80030822753906
370 / 1452 : pp = 179.52085876464844
380 / 1452 : pp = 178.98228454589844
390 / 1452 : pp = 179.0868682861328
400 / 1452 : pp = 178.74569702148438
410 / 1452 : pp = 179.1776580810547
420 / 1452 : pp = 179.5055389404297
430 / 1452 : pp = 179.3883056640625
440 / 1452 : pp = 179.42279052734375
450 / 1452 : pp = 179.2106475830078
460 / 1452 : pp = 178.85311889648438
470 / 1452 : pp = 178.33840942382812
480 / 1452 : pp = 177.60350036621094
490 / 1452 : pp = 177.30335998535156
500 / 1452 : pp = 176.72222900390625
510 / 1452 : pp = 176.6067352294922
520 / 1452 : pp = 176.33998107910156
530 / 1452 : pp = 175.93162536621094
540 / 1452 : pp = 175.30657958984375
550 / 1452 : pp = 174.9462432861328
560 / 1452 : pp = 174.5836639404297
570 / 1452 : pp = 174.31431579589844
580 / 1452 : pp = 173.92300415039062
590 / 1452 : pp = 173.55856323242188
600 / 1452 : pp = 173.08277893066406
610 / 1452 : pp = 172.75930786132812
620 / 1452 : pp = 172.53192138671875
630 / 1452 : pp = 172.20652770996094
640 / 1452 : pp = 172.37454223632812
650 / 1452 : pp = 172.39845275878906
660 / 1452 : pp = 172.52255249023438
670 / 1452 : pp = 172.60935974121094
680 / 1452 : pp = 172.6611328125
690 / 1452 : pp = 172.53118896484375
700 / 1452 : pp = 172.4709014892578
710 / 1452 : pp = 172.5406494140625
720 / 1452 : pp = 172.55447387695312
730 / 1452 : pp = 172.5330047607422
740 / 1452 : pp = 172.7061767578125
750 / 1452 : pp = 172.71054077148438
760 / 1452 : pp = 172.77743530273438
770 / 1452 : pp = 172.95481872558594
780 / 1452 : pp = 173.11265563964844
790 / 1452 : pp = 173.2832794189453
800 / 1452 : pp = 173.2537841796875
810 / 1452 : pp = 173.22164916992188
820 / 1452 : pp = 173.24148559570312
830 / 1452 : pp = 173.48228454589844
840 / 1452 : pp = 173.43753051757812
850 / 1452 : pp = 173.505615234375
860 / 1452 : pp = 173.5214080810547
870 / 1452 : pp = 173.5009002685547
880 / 1452 : pp = 173.6202392578125
890 / 1452 : pp = 173.622802734375
900 / 1452 : pp = 173.5987091064453
910 / 1452 : pp = 173.68316650390625
920 / 1452 : pp = 173.77330017089844
930 / 1452 : pp = 173.72018432617188
940 / 1452 : pp = 173.79351806640625
950 / 1452 : pp = 173.7653350830078
960 / 1452 : pp = 173.7102508544922
970 / 1452 : pp = 173.69766235351562
980 / 1452 : pp = 173.4836883544922
990 / 1452 : pp = 173.3550262451172
1000 / 1452 : pp = 173.14816284179688
1010 / 1452 : pp = 173.20777893066406
1020 / 1452 : pp = 173.3390655517578
1030 / 1452 : pp = 173.2884063720703
1040 / 1452 : pp = 173.38015747070312
1050 / 1452 : pp = 173.35592651367188
1060 / 1452 : pp = 173.2260284423828
1070 / 1452 : pp = 173.39321899414062
1080 / 1452 : pp = 173.4879913330078
1090 / 1452 : pp = 173.5231475830078
1100 / 1452 : pp = 173.47177124023438
1110 / 1452 : pp = 173.24453735351562
1120 / 1452 : pp = 173.09408569335938
1130 / 1452 : pp = 172.86627197265625
1140 / 1452 : pp = 172.8234100341797
1150 / 1452 : pp = 172.92843627929688
1160 / 1452 : pp = 172.90065002441406
1170 / 1452 : pp = 172.8550567626953
1180 / 1452 : pp = 172.8810272216797
1190 / 1452 : pp = 172.97312927246094
1200 / 1452 : pp = 172.9776611328125
1210 / 1452 : pp = 172.89413452148438
1220 / 1452 : pp = 173.0257568359375
1230 / 1452 : pp = 173.1847381591797
1240 / 1452 : pp = 173.1756591796875
1250 / 1452 : pp = 173.32138061523438
1260 / 1452 : pp = 173.37229919433594
1270 / 1452 : pp = 173.36891174316406
1280 / 1452 : pp = 173.36337280273438
1290 / 1452 : pp = 173.3444366455078
1300 / 1452 : pp = 173.36138916015625
1310 / 1452 : pp = 173.4015655517578
1320 / 1452 : pp = 173.31790161132812
1330 / 1452 : pp = 173.24710083007812
1340 / 1452 : pp = 173.27212524414062
1350 / 1452 : pp = 173.27674865722656
1360 / 1452 : pp = 173.32749938964844
1370 / 1452 : pp = 173.20472717285156
1380 / 1452 : pp = 173.14889526367188
1390 / 1452 : pp = 173.0755157470703
1400 / 1452 : pp = 172.9678497314453
1410 / 1452 : pp = 172.9612579345703
1420 / 1452 : pp = 172.8872833251953
1430 / 1452 : pp = 172.84805297851562
1440 / 1452 : pp = 172.87252807617188
1450 / 1452 : pp = 172.82505798339844
0 / 115 : pp = 236.35635375976562
10 / 115 : pp = 219.06166076660156
20 / 115 : pp = 219.7670440673828
30 / 115 : pp = 217.33587646484375
40 / 115 : pp = 216.6626739501953
50 / 115 : pp = 212.04734802246094
60 / 115 : pp = 211.42068481445312
70 / 115 : pp = 207.9592742919922
80 / 115 : pp = 205.6216583251953
90 / 115 : pp = 202.93597412109375
100 / 115 : pp = 198.62583923339844
110 / 115 : pp = 196.97216796875
Training perplexity: 172.80404663085938
Validation perplexity:196.6871337890625
Total time : 41.52522921562195
Epoch 7
0 / 1452 : pp = 219.23231506347656
10 / 1452 : pp = 192.07225036621094
20 / 1452 : pp = 187.48464965820312
30 / 1452 : pp = 182.9149932861328
40 / 1452 : pp = 184.2945098876953
50 / 1452 : pp = 180.78492736816406
60 / 1452 : pp = 179.377197265625
70 / 1452 : pp = 180.0273895263672
80 / 1452 : pp = 179.2517547607422
90 / 1452 : pp = 177.77540588378906
100 / 1452 : pp = 176.6474151611328
110 / 1452 : pp = 174.84066772460938
120 / 1452 : pp = 174.46890258789062
130 / 1452 : pp = 173.64573669433594
140 / 1452 : pp = 172.17483520507812
150 / 1452 : pp = 171.57041931152344
160 / 1452 : pp = 171.92059326171875
170 / 1452 : pp = 171.5497283935547
180 / 1452 : pp = 170.77249145507812
190 / 1452 : pp = 170.72103881835938
200 / 1452 : pp = 171.336181640625
210 / 1452 : pp = 170.98524475097656
220 / 1452 : pp = 170.99771118164062
230 / 1452 : pp = 171.39918518066406
240 / 1452 : pp = 171.09925842285156
250 / 1452 : pp = 170.39962768554688
260 / 1452 : pp = 169.7328643798828
270 / 1452 : pp = 168.72225952148438
280 / 1452 : pp = 168.92552185058594
290 / 1452 : pp = 169.20147705078125
300 / 1452 : pp = 169.40338134765625
310 / 1452 : pp = 169.12057495117188
320 / 1452 : pp = 169.31236267089844
330 / 1452 : pp = 169.49945068359375
340 / 1452 : pp = 168.8396759033203
350 / 1452 : pp = 169.25917053222656
360 / 1452 : pp = 169.09388732910156
370 / 1452 : pp = 168.84323120117188
380 / 1452 : pp = 168.3832550048828
390 / 1452 : pp = 168.48275756835938
400 / 1452 : pp = 168.19972229003906
410 / 1452 : pp = 168.5838623046875
420 / 1452 : pp = 168.91119384765625
430 / 1452 : pp = 168.80836486816406
440 / 1452 : pp = 168.90264892578125
450 / 1452 : pp = 168.68589782714844
460 / 1452 : pp = 168.3704071044922
470 / 1452 : pp = 167.90394592285156
480 / 1452 : pp = 167.23373413085938
490 / 1452 : pp = 166.9560546875
500 / 1452 : pp = 166.43161010742188
510 / 1452 : pp = 166.320068359375
520 / 1452 : pp = 166.05902099609375
530 / 1452 : pp = 165.71714782714844
540 / 1452 : pp = 165.10398864746094
550 / 1452 : pp = 164.80430603027344
560 / 1452 : pp = 164.4687042236328
570 / 1452 : pp = 164.2272491455078
580 / 1452 : pp = 163.84312438964844
590 / 1452 : pp = 163.46035766601562
600 / 1452 : pp = 163.01559448242188
610 / 1452 : pp = 162.74134826660156
620 / 1452 : pp = 162.50267028808594
630 / 1452 : pp = 162.2018280029297
640 / 1452 : pp = 162.37130737304688
650 / 1452 : pp = 162.3895721435547
660 / 1452 : pp = 162.51351928710938
670 / 1452 : pp = 162.57684326171875
680 / 1452 : pp = 162.6346893310547
690 / 1452 : pp = 162.5135955810547
700 / 1452 : pp = 162.47052001953125
710 / 1452 : pp = 162.539794921875
720 / 1452 : pp = 162.55381774902344
730 / 1452 : pp = 162.5297088623047
740 / 1452 : pp = 162.71652221679688
750 / 1452 : pp = 162.740966796875
760 / 1452 : pp = 162.79754638671875
770 / 1452 : pp = 162.9949951171875
780 / 1452 : pp = 163.17868041992188
790 / 1452 : pp = 163.33055114746094
800 / 1452 : pp = 163.31591796875
810 / 1452 : pp = 163.2859344482422
820 / 1452 : pp = 163.2958984375
830 / 1452 : pp = 163.528564453125
840 / 1452 : pp = 163.47610473632812
850 / 1452 : pp = 163.5260772705078
860 / 1452 : pp = 163.55352783203125
870 / 1452 : pp = 163.55718994140625
880 / 1452 : pp = 163.67523193359375
890 / 1452 : pp = 163.6920166015625
900 / 1452 : pp = 163.67710876464844
910 / 1452 : pp = 163.7476806640625
920 / 1452 : pp = 163.84803771972656
930 / 1452 : pp = 163.8114013671875
940 / 1452 : pp = 163.86663818359375
950 / 1452 : pp = 163.83531188964844
960 / 1452 : pp = 163.79945373535156
970 / 1452 : pp = 163.80320739746094
980 / 1452 : pp = 163.5953369140625
990 / 1452 : pp = 163.48382568359375
1000 / 1452 : pp = 163.2642822265625
1010 / 1452 : pp = 163.32113647460938
1020 / 1452 : pp = 163.44204711914062
1030 / 1452 : pp = 163.40206909179688
1040 / 1452 : pp = 163.4915313720703
1050 / 1452 : pp = 163.47096252441406
1060 / 1452 : pp = 163.3601531982422
1070 / 1452 : pp = 163.5138397216797
1080 / 1452 : pp = 163.6189727783203
1090 / 1452 : pp = 163.6471405029297
1100 / 1452 : pp = 163.60406494140625
1110 / 1452 : pp = 163.40736389160156
1120 / 1452 : pp = 163.26841735839844
1130 / 1452 : pp = 163.0680694580078
1140 / 1452 : pp = 163.04591369628906
1150 / 1452 : pp = 163.15478515625
1160 / 1452 : pp = 163.1380615234375
1170 / 1452 : pp = 163.09303283691406
1180 / 1452 : pp = 163.14149475097656
1190 / 1452 : pp = 163.2374267578125
1200 / 1452 : pp = 163.2394561767578
1210 / 1452 : pp = 163.17835998535156
1220 / 1452 : pp = 163.32347106933594
1230 / 1452 : pp = 163.4639434814453
1240 / 1452 : pp = 163.4611358642578
1250 / 1452 : pp = 163.60687255859375
1260 / 1452 : pp = 163.67227172851562
1270 / 1452 : pp = 163.67515563964844
1280 / 1452 : pp = 163.6881103515625
1290 / 1452 : pp = 163.66648864746094
1300 / 1452 : pp = 163.69287109375
1310 / 1452 : pp = 163.7276153564453
1320 / 1452 : pp = 163.6551055908203
1330 / 1452 : pp = 163.58901977539062
1340 / 1452 : pp = 163.6205291748047
1350 / 1452 : pp = 163.63824462890625
1360 / 1452 : pp = 163.69334411621094
1370 / 1452 : pp = 163.5885467529297
1380 / 1452 : pp = 163.54049682617188
1390 / 1452 : pp = 163.4760284423828
1400 / 1452 : pp = 163.38897705078125
1410 / 1452 : pp = 163.3974609375
1420 / 1452 : pp = 163.35009765625
1430 / 1452 : pp = 163.32191467285156
1440 / 1452 : pp = 163.35220336914062
1450 / 1452 : pp = 163.3201904296875
0 / 115 : pp = 232.2108154296875
10 / 115 : pp = 214.35496520996094
20 / 115 : pp = 215.20510864257812
30 / 115 : pp = 212.82754516601562
40 / 115 : pp = 212.0598907470703
50 / 115 : pp = 207.5095672607422
60 / 115 : pp = 206.86976623535156
70 / 115 : pp = 203.36016845703125
80 / 115 : pp = 201.11538696289062
90 / 115 : pp = 198.52120971679688
100 / 115 : pp = 194.1772003173828
110 / 115 : pp = 192.41224670410156
Training perplexity: 163.29916381835938
Validation perplexity:192.09552001953125
Total time : 41.78096055984497
Epoch 8
0 / 1452 : pp = 201.77548217773438
10 / 1452 : pp = 180.4141082763672
20 / 1452 : pp = 176.41432189941406
30 / 1452 : pp = 172.7764434814453
40 / 1452 : pp = 174.69166564941406
50 / 1452 : pp = 171.2933807373047
60 / 1452 : pp = 170.08010864257812
70 / 1452 : pp = 170.6719512939453
80 / 1452 : pp = 170.07589721679688
90 / 1452 : pp = 168.7478485107422
100 / 1452 : pp = 167.57081604003906
110 / 1452 : pp = 166.06971740722656
120 / 1452 : pp = 165.73374938964844
130 / 1452 : pp = 164.80674743652344
140 / 1452 : pp = 163.32821655273438
150 / 1452 : pp = 162.6752471923828
160 / 1452 : pp = 163.02049255371094
170 / 1452 : pp = 162.64120483398438
180 / 1452 : pp = 161.95529174804688
190 / 1452 : pp = 161.91954040527344
200 / 1452 : pp = 162.5446014404297
210 / 1452 : pp = 162.2645721435547
220 / 1452 : pp = 162.3128662109375
230 / 1452 : pp = 162.65872192382812
240 / 1452 : pp = 162.40948486328125
250 / 1452 : pp = 161.75787353515625
260 / 1452 : pp = 161.15213012695312
270 / 1452 : pp = 160.22256469726562
280 / 1452 : pp = 160.3651123046875
290 / 1452 : pp = 160.63780212402344
300 / 1452 : pp = 160.80026245117188
310 / 1452 : pp = 160.54383850097656
320 / 1452 : pp = 160.7539520263672
330 / 1452 : pp = 160.94317626953125
340 / 1452 : pp = 160.3373565673828
350 / 1452 : pp = 160.71763610839844
360 / 1452 : pp = 160.60960388183594
370 / 1452 : pp = 160.37527465820312
380 / 1452 : pp = 159.92990112304688
390 / 1452 : pp = 160.0165557861328
400 / 1452 : pp = 159.75697326660156
410 / 1452 : pp = 160.15274047851562
420 / 1452 : pp = 160.48390197753906
430 / 1452 : pp = 160.4031982421875
440 / 1452 : pp = 160.4693603515625
450 / 1452 : pp = 160.28016662597656
460 / 1452 : pp = 159.94004821777344
470 / 1452 : pp = 159.48257446289062
480 / 1452 : pp = 158.87998962402344
490 / 1452 : pp = 158.59765625
500 / 1452 : pp = 158.10865783691406
510 / 1452 : pp = 157.96795654296875
520 / 1452 : pp = 157.7591552734375
530 / 1452 : pp = 157.42648315429688
540 / 1452 : pp = 156.85348510742188
550 / 1452 : pp = 156.5618438720703
560 / 1452 : pp = 156.24905395507812
570 / 1452 : pp = 155.9994354248047
580 / 1452 : pp = 155.612060546875
590 / 1452 : pp = 155.25830078125
600 / 1452 : pp = 154.8464813232422
610 / 1452 : pp = 154.5833282470703
620 / 1452 : pp = 154.38040161132812
630 / 1452 : pp = 154.0767364501953
640 / 1452 : pp = 154.2534637451172
650 / 1452 : pp = 154.25875854492188
660 / 1452 : pp = 154.35874938964844
670 / 1452 : pp = 154.4289093017578
680 / 1452 : pp = 154.51412963867188
690 / 1452 : pp = 154.41676330566406
700 / 1452 : pp = 154.37892150878906
710 / 1452 : pp = 154.4234619140625
720 / 1452 : pp = 154.4586639404297
730 / 1452 : pp = 154.4351806640625
740 / 1452 : pp = 154.6002197265625
750 / 1452 : pp = 154.65684509277344
760 / 1452 : pp = 154.73318481445312
770 / 1452 : pp = 154.92935180664062
780 / 1452 : pp = 155.1021728515625
790 / 1452 : pp = 155.24757385253906
800 / 1452 : pp = 155.223876953125
810 / 1452 : pp = 155.2095184326172
820 / 1452 : pp = 155.24009704589844
830 / 1452 : pp = 155.4519500732422
840 / 1452 : pp = 155.3947296142578
850 / 1452 : pp = 155.45306396484375
860 / 1452 : pp = 155.4661102294922
870 / 1452 : pp = 155.45765686035156
880 / 1452 : pp = 155.58758544921875
890 / 1452 : pp = 155.59373474121094
900 / 1452 : pp = 155.59254455566406
910 / 1452 : pp = 155.66854858398438
920 / 1452 : pp = 155.75942993164062
930 / 1452 : pp = 155.73350524902344
940 / 1452 : pp = 155.80740356445312
950 / 1452 : pp = 155.7733917236328
960 / 1452 : pp = 155.73565673828125
970 / 1452 : pp = 155.74404907226562
980 / 1452 : pp = 155.55902099609375
990 / 1452 : pp = 155.45675659179688
1000 / 1452 : pp = 155.2649688720703
1010 / 1452 : pp = 155.31332397460938
1020 / 1452 : pp = 155.44979858398438
1030 / 1452 : pp = 155.4137725830078
1040 / 1452 : pp = 155.49012756347656
1050 / 1452 : pp = 155.46054077148438
1060 / 1452 : pp = 155.3616943359375
1070 / 1452 : pp = 155.5286865234375
1080 / 1452 : pp = 155.63743591308594
1090 / 1452 : pp = 155.6842803955078
1100 / 1452 : pp = 155.65599060058594
1110 / 1452 : pp = 155.4827880859375
1120 / 1452 : pp = 155.35450744628906
1130 / 1452 : pp = 155.1777801513672
1140 / 1452 : pp = 155.15994262695312
1150 / 1452 : pp = 155.26193237304688
1160 / 1452 : pp = 155.26214599609375
1170 / 1452 : pp = 155.23231506347656
1180 / 1452 : pp = 155.29266357421875
1190 / 1452 : pp = 155.37680053710938
1200 / 1452 : pp = 155.3736114501953
1210 / 1452 : pp = 155.3380584716797
1220 / 1452 : pp = 155.474853515625
1230 / 1452 : pp = 155.62986755371094
1240 / 1452 : pp = 155.62831115722656
1250 / 1452 : pp = 155.77101135253906
1260 / 1452 : pp = 155.83445739746094
1270 / 1452 : pp = 155.845458984375
1280 / 1452 : pp = 155.8556365966797
1290 / 1452 : pp = 155.8556365966797
1300 / 1452 : pp = 155.8843994140625
1310 / 1452 : pp = 155.92417907714844
1320 / 1452 : pp = 155.8560791015625
1330 / 1452 : pp = 155.80636596679688
1340 / 1452 : pp = 155.84344482421875
1350 / 1452 : pp = 155.8706512451172
1360 / 1452 : pp = 155.9273681640625
1370 / 1452 : pp = 155.83140563964844
1380 / 1452 : pp = 155.7911376953125
1390 / 1452 : pp = 155.7401885986328
1400 / 1452 : pp = 155.6622314453125
1410 / 1452 : pp = 155.68531799316406
1420 / 1452 : pp = 155.64041137695312
1430 / 1452 : pp = 155.62216186523438
1440 / 1452 : pp = 155.6437530517578
1450 / 1452 : pp = 155.62757873535156
0 / 115 : pp = 228.70111083984375
10 / 115 : pp = 211.03330993652344
20 / 115 : pp = 212.24957275390625
30 / 115 : pp = 209.8839569091797
40 / 115 : pp = 209.11045837402344
50 / 115 : pp = 204.66351318359375
60 / 115 : pp = 204.03366088867188
70 / 115 : pp = 200.46681213378906
80 / 115 : pp = 198.24404907226562
90 / 115 : pp = 195.63223266601562
100 / 115 : pp = 191.18345642089844
110 / 115 : pp = 189.31134033203125
Training perplexity: 155.61154174804688
Validation perplexity:188.94537353515625
Total time : 42.13483738899231
Epoch 9
0 / 1452 : pp = 197.80628967285156
10 / 1452 : pp = 172.6316680908203
20 / 1452 : pp = 168.6739959716797
30 / 1452 : pp = 164.4781036376953
40 / 1452 : pp = 166.1627960205078
50 / 1452 : pp = 163.05197143554688
60 / 1452 : pp = 161.87924194335938
70 / 1452 : pp = 162.5297088623047
80 / 1452 : pp = 161.7450714111328
90 / 1452 : pp = 160.6148223876953
100 / 1452 : pp = 159.73289489746094
110 / 1452 : pp = 158.4092254638672
120 / 1452 : pp = 158.04653930664062
130 / 1452 : pp = 157.13563537597656
140 / 1452 : pp = 155.71798706054688
150 / 1452 : pp = 155.19161987304688
160 / 1452 : pp = 155.42718505859375
170 / 1452 : pp = 155.0531463623047
180 / 1452 : pp = 154.46897888183594
190 / 1452 : pp = 154.4127197265625
200 / 1452 : pp = 154.97154235839844
210 / 1452 : pp = 154.70169067382812
220 / 1452 : pp = 154.72816467285156
230 / 1452 : pp = 155.03799438476562
240 / 1452 : pp = 154.85601806640625
250 / 1452 : pp = 154.28016662597656
260 / 1452 : pp = 153.7699432373047
270 / 1452 : pp = 152.90948486328125
280 / 1452 : pp = 153.0459747314453
290 / 1452 : pp = 153.298095703125
300 / 1452 : pp = 153.45716857910156
310 / 1452 : pp = 153.22195434570312
320 / 1452 : pp = 153.41664123535156
330 / 1452 : pp = 153.66542053222656
340 / 1452 : pp = 153.06378173828125
350 / 1452 : pp = 153.43923950195312
360 / 1452 : pp = 153.31381225585938
370 / 1452 : pp = 153.13473510742188
380 / 1452 : pp = 152.75267028808594
390 / 1452 : pp = 152.85504150390625
400 / 1452 : pp = 152.62342834472656
410 / 1452 : pp = 153.03152465820312
420 / 1452 : pp = 153.39161682128906
430 / 1452 : pp = 153.30364990234375
440 / 1452 : pp = 153.37896728515625
450 / 1452 : pp = 153.18988037109375
460 / 1452 : pp = 152.88478088378906
470 / 1452 : pp = 152.4380340576172
480 / 1452 : pp = 151.86618041992188
490 / 1452 : pp = 151.5962371826172
500 / 1452 : pp = 151.11614990234375
510 / 1452 : pp = 150.99830627441406
520 / 1452 : pp = 150.8135986328125
530 / 1452 : pp = 150.500732421875
540 / 1452 : pp = 149.9623260498047
550 / 1452 : pp = 149.68028259277344
560 / 1452 : pp = 149.3885040283203
570 / 1452 : pp = 149.140380859375
580 / 1452 : pp = 148.76876831054688
590 / 1452 : pp = 148.43368530273438
600 / 1452 : pp = 148.02598571777344
610 / 1452 : pp = 147.7869110107422
620 / 1452 : pp = 147.59796142578125
630 / 1452 : pp = 147.30068969726562
640 / 1452 : pp = 147.45240783691406
650 / 1452 : pp = 147.4651336669922
660 / 1452 : pp = 147.5808563232422
670 / 1452 : pp = 147.65582275390625
680 / 1452 : pp = 147.7360382080078
690 / 1452 : pp = 147.63075256347656
700 / 1452 : pp = 147.6066131591797
710 / 1452 : pp = 147.7024383544922
720 / 1452 : pp = 147.7445526123047
730 / 1452 : pp = 147.72279357910156
740 / 1452 : pp = 147.87107849121094
750 / 1452 : pp = 147.91436767578125
760 / 1452 : pp = 147.9857635498047
770 / 1452 : pp = 148.18206787109375
780 / 1452 : pp = 148.3845672607422
790 / 1452 : pp = 148.5517120361328
800 / 1452 : pp = 148.54002380371094
810 / 1452 : pp = 148.51119995117188
820 / 1452 : pp = 148.5664520263672
830 / 1452 : pp = 148.7821044921875
840 / 1452 : pp = 148.72486877441406
850 / 1452 : pp = 148.77452087402344
860 / 1452 : pp = 148.80076599121094
870 / 1452 : pp = 148.79701232910156
880 / 1452 : pp = 148.9181671142578
890 / 1452 : pp = 148.94537353515625
900 / 1452 : pp = 148.9435272216797
910 / 1452 : pp = 149.02102661132812
920 / 1452 : pp = 149.1085968017578
930 / 1452 : pp = 149.06893920898438
940 / 1452 : pp = 149.1317138671875
950 / 1452 : pp = 149.1232452392578
960 / 1452 : pp = 149.10354614257812
970 / 1452 : pp = 149.11656188964844
980 / 1452 : pp = 148.94259643554688
990 / 1452 : pp = 148.8236846923828
1000 / 1452 : pp = 148.633056640625
1010 / 1452 : pp = 148.6830291748047
1020 / 1452 : pp = 148.8126220703125
1030 / 1452 : pp = 148.78089904785156
1040 / 1452 : pp = 148.8600311279297
1050 / 1452 : pp = 148.8486785888672
1060 / 1452 : pp = 148.7664337158203
1070 / 1452 : pp = 148.9337921142578
1080 / 1452 : pp = 149.04441833496094
1090 / 1452 : pp = 149.07284545898438
1100 / 1452 : pp = 149.03318786621094
1110 / 1452 : pp = 148.86428833007812
1120 / 1452 : pp = 148.7332305908203
1130 / 1452 : pp = 148.5670166015625
1140 / 1452 : pp = 148.54661560058594
1150 / 1452 : pp = 148.64219665527344
1160 / 1452 : pp = 148.6490020751953
1170 / 1452 : pp = 148.62420654296875
1180 / 1452 : pp = 148.67665100097656
1190 / 1452 : pp = 148.7633056640625
1200 / 1452 : pp = 148.7782745361328
1210 / 1452 : pp = 148.72500610351562
1220 / 1452 : pp = 148.87493896484375
1230 / 1452 : pp = 149.039794921875
1240 / 1452 : pp = 149.04000854492188
1250 / 1452 : pp = 149.17054748535156
1260 / 1452 : pp = 149.23863220214844
1270 / 1452 : pp = 149.2436065673828
1280 / 1452 : pp = 149.25086975097656
1290 / 1452 : pp = 149.24147033691406
1300 / 1452 : pp = 149.27413940429688
1310 / 1452 : pp = 149.32077026367188
1320 / 1452 : pp = 149.27301025390625
1330 / 1452 : pp = 149.23080444335938
1340 / 1452 : pp = 149.25791931152344
1350 / 1452 : pp = 149.2841033935547
1360 / 1452 : pp = 149.337158203125
1370 / 1452 : pp = 149.2467498779297
1380 / 1452 : pp = 149.21351623535156
1390 / 1452 : pp = 149.15403747558594
1400 / 1452 : pp = 149.0877685546875
1410 / 1452 : pp = 149.110595703125
1420 / 1452 : pp = 149.07241821289062
1430 / 1452 : pp = 149.05166625976562
1440 / 1452 : pp = 149.0776824951172
1450 / 1452 : pp = 149.06771850585938
0 / 115 : pp = 227.0559844970703
10 / 115 : pp = 208.7002410888672
20 / 115 : pp = 210.38775634765625
30 / 115 : pp = 207.9513397216797
40 / 115 : pp = 207.12994384765625
50 / 115 : pp = 202.70811462402344
60 / 115 : pp = 202.05787658691406
70 / 115 : pp = 198.3761444091797
80 / 115 : pp = 196.17637634277344
90 / 115 : pp = 193.5880126953125
100 / 115 : pp = 189.0758819580078
110 / 115 : pp = 187.07528686523438
Training perplexity: 149.0502471923828
Validation perplexity:186.6911163330078
Total time : 47.274805545806885
Epoch 10
0 / 1452 : pp = 181.8408203125
10 / 1452 : pp = 164.99664306640625
20 / 1452 : pp = 161.8847198486328
30 / 1452 : pp = 158.30064392089844
40 / 1452 : pp = 160.13914489746094
50 / 1452 : pp = 157.58743286132812
60 / 1452 : pp = 156.11871337890625
70 / 1452 : pp = 156.82948303222656
80 / 1452 : pp = 156.2889862060547
90 / 1452 : pp = 155.04833984375
100 / 1452 : pp = 154.09327697753906
110 / 1452 : pp = 152.5070343017578
120 / 1452 : pp = 152.20750427246094
130 / 1452 : pp = 151.3399200439453
140 / 1452 : pp = 149.90740966796875
150 / 1452 : pp = 149.345703125
160 / 1452 : pp = 149.59814453125
170 / 1452 : pp = 149.26539611816406
180 / 1452 : pp = 148.624267578125
190 / 1452 : pp = 148.58819580078125
200 / 1452 : pp = 149.09552001953125
210 / 1452 : pp = 148.8439178466797
220 / 1452 : pp = 148.86605834960938
230 / 1452 : pp = 149.1971435546875
240 / 1452 : pp = 148.96533203125
250 / 1452 : pp = 148.4253387451172
260 / 1452 : pp = 147.9200897216797
270 / 1452 : pp = 147.08816528320312
280 / 1452 : pp = 147.24366760253906
290 / 1452 : pp = 147.52182006835938
300 / 1452 : pp = 147.72222900390625
310 / 1452 : pp = 147.50486755371094
320 / 1452 : pp = 147.73892211914062
330 / 1452 : pp = 147.9404754638672
340 / 1452 : pp = 147.37803649902344
350 / 1452 : pp = 147.6969451904297
360 / 1452 : pp = 147.5704345703125
370 / 1452 : pp = 147.38674926757812
380 / 1452 : pp = 147.03970336914062
390 / 1452 : pp = 147.14231872558594
400 / 1452 : pp = 146.91656494140625
410 / 1452 : pp = 147.34059143066406
420 / 1452 : pp = 147.68496704101562
430 / 1452 : pp = 147.61195373535156
440 / 1452 : pp = 147.68405151367188
450 / 1452 : pp = 147.4711151123047
460 / 1452 : pp = 147.1927032470703
470 / 1452 : pp = 146.72970581054688
480 / 1452 : pp = 146.17173767089844
490 / 1452 : pp = 145.9028778076172
500 / 1452 : pp = 145.42721557617188
510 / 1452 : pp = 145.3111114501953
520 / 1452 : pp = 145.11460876464844
530 / 1452 : pp = 144.81488037109375
540 / 1452 : pp = 144.263916015625
550 / 1452 : pp = 143.997802734375
560 / 1452 : pp = 143.71766662597656
570 / 1452 : pp = 143.47451782226562
580 / 1452 : pp = 143.08474731445312
590 / 1452 : pp = 142.77920532226562
600 / 1452 : pp = 142.39573669433594
610 / 1452 : pp = 142.14906311035156
620 / 1452 : pp = 141.9574432373047
630 / 1452 : pp = 141.67369079589844
640 / 1452 : pp = 141.81556701660156
650 / 1452 : pp = 141.81759643554688
660 / 1452 : pp = 141.9339599609375
670 / 1452 : pp = 142.01248168945312
680 / 1452 : pp = 142.08773803710938
690 / 1452 : pp = 142.00328063964844
700 / 1452 : pp = 141.98086547851562
710 / 1452 : pp = 142.0632781982422
720 / 1452 : pp = 142.10372924804688
730 / 1452 : pp = 142.08055114746094
740 / 1452 : pp = 142.23619079589844
750 / 1452 : pp = 142.2660369873047
760 / 1452 : pp = 142.34678649902344
770 / 1452 : pp = 142.5257568359375
780 / 1452 : pp = 142.70025634765625
790 / 1452 : pp = 142.8614044189453
800 / 1452 : pp = 142.84573364257812
810 / 1452 : pp = 142.8250274658203
820 / 1452 : pp = 142.8540496826172
830 / 1452 : pp = 143.06053161621094
840 / 1452 : pp = 143.0423126220703
850 / 1452 : pp = 143.09634399414062
860 / 1452 : pp = 143.10487365722656
870 / 1452 : pp = 143.0884246826172
880 / 1452 : pp = 143.19387817382812
890 / 1452 : pp = 143.236083984375
900 / 1452 : pp = 143.23390197753906
910 / 1452 : pp = 143.29537963867188
920 / 1452 : pp = 143.3722686767578
930 / 1452 : pp = 143.33795166015625
940 / 1452 : pp = 143.40618896484375
950 / 1452 : pp = 143.3929901123047
960 / 1452 : pp = 143.3693389892578
970 / 1452 : pp = 143.39736938476562
980 / 1452 : pp = 143.2371063232422
990 / 1452 : pp = 143.13893127441406
1000 / 1452 : pp = 142.9658660888672
1010 / 1452 : pp = 143.01544189453125
1020 / 1452 : pp = 143.152587890625
1030 / 1452 : pp = 143.11334228515625
1040 / 1452 : pp = 143.19020080566406
1050 / 1452 : pp = 143.18234252929688
1060 / 1452 : pp = 143.092041015625
1070 / 1452 : pp = 143.24449157714844
1080 / 1452 : pp = 143.34828186035156
1090 / 1452 : pp = 143.38739013671875
1100 / 1452 : pp = 143.37432861328125
1110 / 1452 : pp = 143.20596313476562
1120 / 1452 : pp = 143.07969665527344
1130 / 1452 : pp = 142.92041015625
1140 / 1452 : pp = 142.90902709960938
1150 / 1452 : pp = 143.00732421875
1160 / 1452 : pp = 143.01182556152344
1170 / 1452 : pp = 142.9925994873047
1180 / 1452 : pp = 143.06080627441406
1190 / 1452 : pp = 143.14337158203125
1200 / 1452 : pp = 143.16644287109375
1210 / 1452 : pp = 143.1259002685547
1220 / 1452 : pp = 143.2671661376953
1230 / 1452 : pp = 143.4210968017578
1240 / 1452 : pp = 143.4327850341797
1250 / 1452 : pp = 143.5699920654297
1260 / 1452 : pp = 143.63771057128906
1270 / 1452 : pp = 143.65798950195312
1280 / 1452 : pp = 143.68251037597656
1290 / 1452 : pp = 143.68045043945312
1300 / 1452 : pp = 143.72293090820312
1310 / 1452 : pp = 143.77015686035156
1320 / 1452 : pp = 143.71910095214844
1330 / 1452 : pp = 143.68792724609375
1340 / 1452 : pp = 143.7241668701172
1350 / 1452 : pp = 143.7570037841797
1360 / 1452 : pp = 143.81829833984375
1370 / 1452 : pp = 143.7487030029297
1380 / 1452 : pp = 143.7196502685547
1390 / 1452 : pp = 143.67359924316406
1400 / 1452 : pp = 143.60592651367188
1410 / 1452 : pp = 143.62620544433594
1420 / 1452 : pp = 143.5905303955078
1430 / 1452 : pp = 143.55799865722656
1440 / 1452 : pp = 143.5891571044922
1450 / 1452 : pp = 143.5869598388672
0 / 115 : pp = 226.9864959716797
10 / 115 : pp = 207.8067169189453
20 / 115 : pp = 209.68667602539062
30 / 115 : pp = 207.1610565185547
40 / 115 : pp = 206.3247833251953
50 / 115 : pp = 201.77403259277344
60 / 115 : pp = 201.07098388671875
70 / 115 : pp = 197.33335876464844
80 / 115 : pp = 195.12513732910156
90 / 115 : pp = 192.5349578857422
100 / 115 : pp = 187.90072631835938
110 / 115 : pp = 185.81240844726562
Training perplexity: 143.57354736328125
Validation perplexity:185.40573120117188
Total time : 46.14846849441528
Epoch 11
0 / 1452 : pp = 181.93162536621094
10 / 1452 : pp = 159.94607543945312
20 / 1452 : pp = 156.83673095703125
30 / 1452 : pp = 153.75843811035156
40 / 1452 : pp = 155.18362426757812
50 / 1452 : pp = 152.39529418945312
60 / 1452 : pp = 151.18772888183594
70 / 1452 : pp = 151.9004364013672
80 / 1452 : pp = 151.30239868164062
90 / 1452 : pp = 150.1591033935547
100 / 1452 : pp = 149.18618774414062
110 / 1452 : pp = 147.72653198242188
120 / 1452 : pp = 147.4357452392578
130 / 1452 : pp = 146.41372680664062
140 / 1452 : pp = 145.0057373046875
150 / 1452 : pp = 144.39447021484375
160 / 1452 : pp = 144.5330047607422
170 / 1452 : pp = 144.23593139648438
180 / 1452 : pp = 143.63990783691406
190 / 1452 : pp = 143.63812255859375
200 / 1452 : pp = 144.1143798828125
210 / 1452 : pp = 143.88278198242188
220 / 1452 : pp = 143.92518615722656
230 / 1452 : pp = 144.24032592773438
240 / 1452 : pp = 143.94110107421875
250 / 1452 : pp = 143.3688507080078
260 / 1452 : pp = 142.8829345703125
270 / 1452 : pp = 142.11952209472656
280 / 1452 : pp = 142.19415283203125
290 / 1452 : pp = 142.51889038085938
300 / 1452 : pp = 142.70494079589844
310 / 1452 : pp = 142.51426696777344
320 / 1452 : pp = 142.70106506347656
330 / 1452 : pp = 142.88014221191406
340 / 1452 : pp = 142.3287353515625
350 / 1452 : pp = 142.6169891357422
360 / 1452 : pp = 142.51971435546875
370 / 1452 : pp = 142.33566284179688
380 / 1452 : pp = 142.04161071777344
390 / 1452 : pp = 142.13551330566406
400 / 1452 : pp = 141.9499969482422
410 / 1452 : pp = 142.3361358642578
420 / 1452 : pp = 142.64065551757812
430 / 1452 : pp = 142.5511016845703
440 / 1452 : pp = 142.6728973388672
450 / 1452 : pp = 142.47030639648438
460 / 1452 : pp = 142.1704864501953
470 / 1452 : pp = 141.73390197753906
480 / 1452 : pp = 141.23020935058594
490 / 1452 : pp = 140.9759521484375
500 / 1452 : pp = 140.51609802246094
510 / 1452 : pp = 140.40545654296875
520 / 1452 : pp = 140.1936492919922
530 / 1452 : pp = 139.8929443359375
540 / 1452 : pp = 139.3696746826172
550 / 1452 : pp = 139.13217163085938
560 / 1452 : pp = 138.85247802734375
570 / 1452 : pp = 138.6092987060547
580 / 1452 : pp = 138.2471160888672
590 / 1452 : pp = 137.9485626220703
600 / 1452 : pp = 137.57379150390625
610 / 1452 : pp = 137.31576538085938
620 / 1452 : pp = 137.14230346679688
630 / 1452 : pp = 136.87405395507812
640 / 1452 : pp = 137.02928161621094
650 / 1452 : pp = 137.0481719970703
660 / 1452 : pp = 137.1595001220703
670 / 1452 : pp = 137.21124267578125
680 / 1452 : pp = 137.2671356201172
690 / 1452 : pp = 137.19410705566406
700 / 1452 : pp = 137.1850128173828
710 / 1452 : pp = 137.26058959960938
720 / 1452 : pp = 137.30726623535156
730 / 1452 : pp = 137.28048706054688
740 / 1452 : pp = 137.4352569580078
750 / 1452 : pp = 137.4680938720703
760 / 1452 : pp = 137.5524139404297
770 / 1452 : pp = 137.73829650878906
780 / 1452 : pp = 137.90882873535156
790 / 1452 : pp = 138.05865478515625
800 / 1452 : pp = 138.0673370361328
810 / 1452 : pp = 138.03909301757812
820 / 1452 : pp = 138.084716796875
830 / 1452 : pp = 138.27989196777344
840 / 1452 : pp = 138.23545837402344
850 / 1452 : pp = 138.30343627929688
860 / 1452 : pp = 138.3339080810547
870 / 1452 : pp = 138.32835388183594
880 / 1452 : pp = 138.4450225830078
890 / 1452 : pp = 138.47157287597656
900 / 1452 : pp = 138.46304321289062
910 / 1452 : pp = 138.55618286132812
920 / 1452 : pp = 138.64512634277344
930 / 1452 : pp = 138.6160430908203
940 / 1452 : pp = 138.66932678222656
950 / 1452 : pp = 138.6573028564453
960 / 1452 : pp = 138.6463165283203
970 / 1452 : pp = 138.67059326171875
980 / 1452 : pp = 138.50999450683594
990 / 1452 : pp = 138.42430114746094
1000 / 1452 : pp = 138.25344848632812
1010 / 1452 : pp = 138.3004608154297
1020 / 1452 : pp = 138.4243621826172
1030 / 1452 : pp = 138.40713500976562
1040 / 1452 : pp = 138.47129821777344
1050 / 1452 : pp = 138.45928955078125
1060 / 1452 : pp = 138.3919677734375
1070 / 1452 : pp = 138.5287628173828
1080 / 1452 : pp = 138.62298583984375
1090 / 1452 : pp = 138.6699981689453
1100 / 1452 : pp = 138.64849853515625
1110 / 1452 : pp = 138.49191284179688
1120 / 1452 : pp = 138.37355041503906
1130 / 1452 : pp = 138.2216796875
1140 / 1452 : pp = 138.21534729003906
1150 / 1452 : pp = 138.30963134765625
1160 / 1452 : pp = 138.316162109375
1170 / 1452 : pp = 138.3023681640625
1180 / 1452 : pp = 138.36932373046875
1190 / 1452 : pp = 138.45960998535156
1200 / 1452 : pp = 138.4866180419922
1210 / 1452 : pp = 138.45730590820312
1220 / 1452 : pp = 138.60031127929688
1230 / 1452 : pp = 138.75485229492188
1240 / 1452 : pp = 138.7751007080078
1250 / 1452 : pp = 138.91221618652344
1260 / 1452 : pp = 138.9815216064453
1270 / 1452 : pp = 138.9919891357422
1280 / 1452 : pp = 139.0243377685547
1290 / 1452 : pp = 139.02725219726562
1300 / 1452 : pp = 139.0701446533203
1310 / 1452 : pp = 139.1090850830078
1320 / 1452 : pp = 139.06027221679688
1330 / 1452 : pp = 139.0338134765625
1340 / 1452 : pp = 139.06385803222656
1350 / 1452 : pp = 139.09608459472656
1360 / 1452 : pp = 139.1609649658203
1370 / 1452 : pp = 139.0869903564453
1380 / 1452 : pp = 139.0604705810547
1390 / 1452 : pp = 139.01670837402344
1400 / 1452 : pp = 138.94393920898438
1410 / 1452 : pp = 138.97323608398438
1420 / 1452 : pp = 138.9404296875
1430 / 1452 : pp = 138.90943908691406
1440 / 1452 : pp = 138.94268798828125
1450 / 1452 : pp = 138.93991088867188
0 / 115 : pp = 225.55990600585938
10 / 115 : pp = 207.0504608154297
20 / 115 : pp = 208.98306274414062
30 / 115 : pp = 206.28396606445312
40 / 115 : pp = 205.35386657714844
50 / 115 : pp = 200.7255401611328
60 / 115 : pp = 200.0526580810547
70 / 115 : pp = 196.33087158203125
80 / 115 : pp = 194.12110900878906
90 / 115 : pp = 191.52816772460938
100 / 115 : pp = 186.7974395751953
110 / 115 : pp = 184.59829711914062
Training perplexity: 138.9222869873047
Validation perplexity:184.18101501464844
Total time : 43.92928600311279
Epoch 12
0 / 1452 : pp = 173.0251007080078
10 / 1452 : pp = 152.98446655273438
20 / 1452 : pp = 150.43128967285156
30 / 1452 : pp = 147.5819854736328
40 / 1452 : pp = 149.4164276123047
50 / 1452 : pp = 146.70816040039062
60 / 1452 : pp = 145.557861328125
70 / 1452 : pp = 146.50473022460938
80 / 1452 : pp = 145.83200073242188
90 / 1452 : pp = 144.84402465820312
100 / 1452 : pp = 144.0390167236328
110 / 1452 : pp = 142.66514587402344
120 / 1452 : pp = 142.3549346923828
130 / 1452 : pp = 141.4630126953125
140 / 1452 : pp = 140.2266082763672
150 / 1452 : pp = 139.67518615722656
160 / 1452 : pp = 139.90414428710938
170 / 1452 : pp = 139.5490264892578
180 / 1452 : pp = 138.91969299316406
190 / 1452 : pp = 138.89234924316406
200 / 1452 : pp = 139.40908813476562
210 / 1452 : pp = 139.19068908691406
220 / 1452 : pp = 139.35513305664062
230 / 1452 : pp = 139.5464324951172
240 / 1452 : pp = 139.3047637939453
250 / 1452 : pp = 138.7708740234375
260 / 1452 : pp = 138.29188537597656
270 / 1452 : pp = 137.4787139892578
280 / 1452 : pp = 137.6367950439453
290 / 1452 : pp = 137.98513793945312
300 / 1452 : pp = 138.17819213867188
310 / 1452 : pp = 137.943359375
320 / 1452 : pp = 138.12060546875
330 / 1452 : pp = 138.29037475585938
340 / 1452 : pp = 137.77606201171875
350 / 1452 : pp = 138.06378173828125
360 / 1452 : pp = 137.99000549316406
370 / 1452 : pp = 137.81922912597656
380 / 1452 : pp = 137.52159118652344
390 / 1452 : pp = 137.61782836914062
400 / 1452 : pp = 137.4178924560547
410 / 1452 : pp = 137.82632446289062
420 / 1452 : pp = 138.17567443847656
430 / 1452 : pp = 138.11863708496094
440 / 1452 : pp = 138.215087890625
450 / 1452 : pp = 137.9976348876953
460 / 1452 : pp = 137.6929168701172
470 / 1452 : pp = 137.25416564941406
480 / 1452 : pp = 136.75140380859375
490 / 1452 : pp = 136.51712036132812
500 / 1452 : pp = 136.0896453857422
510 / 1452 : pp = 135.97048950195312
520 / 1452 : pp = 135.7760009765625
530 / 1452 : pp = 135.50389099121094
540 / 1452 : pp = 135.01437377929688
550 / 1452 : pp = 134.7666015625
560 / 1452 : pp = 134.48973083496094
570 / 1452 : pp = 134.22853088378906
580 / 1452 : pp = 133.88455200195312
590 / 1452 : pp = 133.5808868408203
600 / 1452 : pp = 133.22975158691406
610 / 1452 : pp = 132.99591064453125
620 / 1452 : pp = 132.79502868652344
630 / 1452 : pp = 132.5094451904297
640 / 1452 : pp = 132.62892150878906
650 / 1452 : pp = 132.63499450683594
660 / 1452 : pp = 132.7379913330078
670 / 1452 : pp = 132.79046630859375
680 / 1452 : pp = 132.85842895507812
690 / 1452 : pp = 132.80364990234375
700 / 1452 : pp = 132.80477905273438
710 / 1452 : pp = 132.90170288085938
720 / 1452 : pp = 132.92971801757812
730 / 1452 : pp = 132.9019012451172
740 / 1452 : pp = 133.04811096191406
750 / 1452 : pp = 133.10877990722656
760 / 1452 : pp = 133.19189453125
770 / 1452 : pp = 133.3564910888672
780 / 1452 : pp = 133.54000854492188
790 / 1452 : pp = 133.69239807128906
800 / 1452 : pp = 133.68495178222656
810 / 1452 : pp = 133.67971801757812
820 / 1452 : pp = 133.7035675048828
830 / 1452 : pp = 133.89329528808594
840 / 1452 : pp = 133.850341796875
850 / 1452 : pp = 133.90390014648438
860 / 1452 : pp = 133.9090118408203
870 / 1452 : pp = 133.89974975585938
880 / 1452 : pp = 134.0077667236328
890 / 1452 : pp = 134.03485107421875
900 / 1452 : pp = 134.0261688232422
910 / 1452 : pp = 134.10255432128906
920 / 1452 : pp = 134.17291259765625
930 / 1452 : pp = 134.14796447753906
940 / 1452 : pp = 134.20925903320312
950 / 1452 : pp = 134.19281005859375
960 / 1452 : pp = 134.17745971679688
970 / 1452 : pp = 134.18653869628906
980 / 1452 : pp = 134.03192138671875
990 / 1452 : pp = 133.94349670410156
1000 / 1452 : pp = 133.79685974121094
1010 / 1452 : pp = 133.8438262939453
1020 / 1452 : pp = 133.9608612060547
1030 / 1452 : pp = 133.93934631347656
1040 / 1452 : pp = 134.02833557128906
1050 / 1452 : pp = 134.01734924316406
1060 / 1452 : pp = 133.95346069335938
1070 / 1452 : pp = 134.10205078125
1080 / 1452 : pp = 134.2030487060547
1090 / 1452 : pp = 134.23696899414062
1100 / 1452 : pp = 134.2230224609375
1110 / 1452 : pp = 134.0829315185547
1120 / 1452 : pp = 133.980224609375
1130 / 1452 : pp = 133.83815002441406
1140 / 1452 : pp = 133.8366241455078
1150 / 1452 : pp = 133.92108154296875
1160 / 1452 : pp = 133.94375610351562
1170 / 1452 : pp = 133.9360809326172
1180 / 1452 : pp = 133.99684143066406
1190 / 1452 : pp = 134.0944366455078
1200 / 1452 : pp = 134.11676025390625
1210 / 1452 : pp = 134.0911102294922
1220 / 1452 : pp = 134.22763061523438
1230 / 1452 : pp = 134.38043212890625
1240 / 1452 : pp = 134.39817810058594
1250 / 1452 : pp = 134.5367431640625
1260 / 1452 : pp = 134.593017578125
1270 / 1452 : pp = 134.61497497558594
1280 / 1452 : pp = 134.6423797607422
1290 / 1452 : pp = 134.64340209960938
1300 / 1452 : pp = 134.68026733398438
1310 / 1452 : pp = 134.73556518554688
1320 / 1452 : pp = 134.69021606445312
1330 / 1452 : pp = 134.66131591796875
1340 / 1452 : pp = 134.69393920898438
1350 / 1452 : pp = 134.7328643798828
1360 / 1452 : pp = 134.79405212402344
1370 / 1452 : pp = 134.71237182617188
1380 / 1452 : pp = 134.6885528564453
1390 / 1452 : pp = 134.65110778808594
1400 / 1452 : pp = 134.59584045410156
1410 / 1452 : pp = 134.6193389892578
1420 / 1452 : pp = 134.58338928222656
1430 / 1452 : pp = 134.559326171875
1440 / 1452 : pp = 134.59507751464844
1450 / 1452 : pp = 134.59365844726562
0 / 115 : pp = 226.0741729736328
10 / 115 : pp = 207.00494384765625
20 / 115 : pp = 209.26976013183594
30 / 115 : pp = 206.44662475585938
40 / 115 : pp = 205.47268676757812
50 / 115 : pp = 200.7876739501953
60 / 115 : pp = 200.13414001464844
70 / 115 : pp = 196.35549926757812
80 / 115 : pp = 194.10777282714844
90 / 115 : pp = 191.47467041015625
100 / 115 : pp = 186.61351013183594
110 / 115 : pp = 184.30374145507812
Training perplexity: 134.57826232910156
Validation perplexity:183.8900146484375
Total time : 45.410256147384644
Epoch 13
0 / 1452 : pp = 169.39393615722656
10 / 1452 : pp = 150.13232421875
20 / 1452 : pp = 147.60450744628906
30 / 1452 : pp = 144.64317321777344
40 / 1452 : pp = 146.47427368164062
50 / 1452 : pp = 143.929443359375
60 / 1452 : pp = 142.8344268798828
70 / 1452 : pp = 143.45248413085938
80 / 1452 : pp = 142.5418701171875
90 / 1452 : pp = 141.6178436279297
100 / 1452 : pp = 140.70127868652344
110 / 1452 : pp = 139.2852325439453
120 / 1452 : pp = 138.8017120361328
130 / 1452 : pp = 137.85629272460938
140 / 1452 : pp = 136.51718139648438
150 / 1452 : pp = 136.03619384765625
160 / 1452 : pp = 136.154296875
170 / 1452 : pp = 135.67037963867188
180 / 1452 : pp = 135.0376739501953
190 / 1452 : pp = 134.9230499267578
200 / 1452 : pp = 135.4241180419922
210 / 1452 : pp = 135.24581909179688
220 / 1452 : pp = 135.37957763671875
230 / 1452 : pp = 135.67652893066406
240 / 1452 : pp = 135.4161834716797
250 / 1452 : pp = 134.90895080566406
260 / 1452 : pp = 134.46754455566406
270 / 1452 : pp = 133.68577575683594
280 / 1452 : pp = 133.86770629882812
290 / 1452 : pp = 134.18475341796875
300 / 1452 : pp = 134.39132690429688
310 / 1452 : pp = 134.19985961914062
320 / 1452 : pp = 134.37998962402344
330 / 1452 : pp = 134.5557403564453
340 / 1452 : pp = 134.00686645507812
350 / 1452 : pp = 134.27749633789062
360 / 1452 : pp = 134.20286560058594
370 / 1452 : pp = 134.042724609375
380 / 1452 : pp = 133.74398803710938
390 / 1452 : pp = 133.83584594726562
400 / 1452 : pp = 133.64382934570312
410 / 1452 : pp = 134.02366638183594
420 / 1452 : pp = 134.35415649414062
430 / 1452 : pp = 134.310546875
440 / 1452 : pp = 134.3634490966797
450 / 1452 : pp = 134.15602111816406
460 / 1452 : pp = 133.86578369140625
470 / 1452 : pp = 133.43414306640625
480 / 1452 : pp = 132.90310668945312
490 / 1452 : pp = 132.646240234375
500 / 1452 : pp = 132.1982421875
510 / 1452 : pp = 132.04200744628906
520 / 1452 : pp = 131.86940002441406
530 / 1452 : pp = 131.59841918945312
540 / 1452 : pp = 131.12356567382812
550 / 1452 : pp = 130.887939453125
560 / 1452 : pp = 130.6210174560547
570 / 1452 : pp = 130.37826538085938
580 / 1452 : pp = 130.0374755859375
590 / 1452 : pp = 129.75979614257812
600 / 1452 : pp = 129.38308715820312
610 / 1452 : pp = 129.16685485839844
620 / 1452 : pp = 129.0115509033203
630 / 1452 : pp = 128.75152587890625
640 / 1452 : pp = 128.87295532226562
650 / 1452 : pp = 128.88734436035156
660 / 1452 : pp = 128.98275756835938
670 / 1452 : pp = 129.0487060546875
680 / 1452 : pp = 129.11013793945312
690 / 1452 : pp = 129.0646514892578
700 / 1452 : pp = 129.06280517578125
710 / 1452 : pp = 129.1343994140625
720 / 1452 : pp = 129.18582153320312
730 / 1452 : pp = 129.15138244628906
740 / 1452 : pp = 129.29811096191406
750 / 1452 : pp = 129.339599609375
760 / 1452 : pp = 129.4257354736328
770 / 1452 : pp = 129.61631774902344
780 / 1452 : pp = 129.802734375
790 / 1452 : pp = 129.96804809570312
800 / 1452 : pp = 129.95187377929688
810 / 1452 : pp = 129.92417907714844
820 / 1452 : pp = 129.9774627685547
830 / 1452 : pp = 130.1638946533203
840 / 1452 : pp = 130.13095092773438
850 / 1452 : pp = 130.16595458984375
860 / 1452 : pp = 130.173828125
870 / 1452 : pp = 130.170166015625
880 / 1452 : pp = 130.27032470703125
890 / 1452 : pp = 130.3022003173828
900 / 1452 : pp = 130.3071746826172
910 / 1452 : pp = 130.37939453125
920 / 1452 : pp = 130.46229553222656
930 / 1452 : pp = 130.43846130371094
940 / 1452 : pp = 130.50889587402344
950 / 1452 : pp = 130.50086975097656
960 / 1452 : pp = 130.4833221435547
970 / 1452 : pp = 130.50814819335938
980 / 1452 : pp = 130.35577392578125
990 / 1452 : pp = 130.26759338378906
1000 / 1452 : pp = 130.1064453125
1010 / 1452 : pp = 130.1472625732422
1020 / 1452 : pp = 130.27169799804688
1030 / 1452 : pp = 130.25100708007812
1040 / 1452 : pp = 130.30816650390625
1050 / 1452 : pp = 130.29803466796875
1060 / 1452 : pp = 130.2242431640625
1070 / 1452 : pp = 130.35906982421875
1080 / 1452 : pp = 130.45103454589844
1090 / 1452 : pp = 130.49838256835938
1100 / 1452 : pp = 130.484130859375
1110 / 1452 : pp = 130.35316467285156
1120 / 1452 : pp = 130.24697875976562
1130 / 1452 : pp = 130.10804748535156
1140 / 1452 : pp = 130.1076202392578
1150 / 1452 : pp = 130.195068359375
1160 / 1452 : pp = 130.19674682617188
1170 / 1452 : pp = 130.18321228027344
1180 / 1452 : pp = 130.24623107910156
1190 / 1452 : pp = 130.33905029296875
1200 / 1452 : pp = 130.3650360107422
1210 / 1452 : pp = 130.34588623046875
1220 / 1452 : pp = 130.4850616455078
1230 / 1452 : pp = 130.63160705566406
1240 / 1452 : pp = 130.64674377441406
1250 / 1452 : pp = 130.77078247070312
1260 / 1452 : pp = 130.8397674560547
1270 / 1452 : pp = 130.8511199951172
1280 / 1452 : pp = 130.88967895507812
1290 / 1452 : pp = 130.9040985107422
1300 / 1452 : pp = 130.93511962890625
1310 / 1452 : pp = 130.9759063720703
1320 / 1452 : pp = 130.92800903320312
1330 / 1452 : pp = 130.9105224609375
1340 / 1452 : pp = 130.929443359375
1350 / 1452 : pp = 130.96153259277344
1360 / 1452 : pp = 131.02381896972656
1370 / 1452 : pp = 130.9545440673828
1380 / 1452 : pp = 130.9344940185547
1390 / 1452 : pp = 130.9055938720703
1400 / 1452 : pp = 130.85386657714844
1410 / 1452 : pp = 130.8874969482422
1420 / 1452 : pp = 130.85928344726562
1430 / 1452 : pp = 130.83995056152344
1440 / 1452 : pp = 130.86659240722656
1450 / 1452 : pp = 130.86839294433594
0 / 115 : pp = 227.78428649902344
10 / 115 : pp = 207.609619140625
20 / 115 : pp = 209.92459106445312
30 / 115 : pp = 206.96240234375
40 / 115 : pp = 205.9295654296875
50 / 115 : pp = 201.0296630859375
60 / 115 : pp = 200.38059997558594
70 / 115 : pp = 196.55764770507812
80 / 115 : pp = 194.31735229492188
90 / 115 : pp = 191.66146850585938
100 / 115 : pp = 186.70437622070312
110 / 115 : pp = 184.3171844482422
Training perplexity: 130.85043334960938
Validation perplexity:183.88186645507812
Total time : 45.345656394958496
Epoch 14
0 / 1452 : pp = 164.82191467285156
10 / 1452 : pp = 146.39089965820312
20 / 1452 : pp = 142.93240356445312
30 / 1452 : pp = 140.3113555908203
40 / 1452 : pp = 142.39939880371094
50 / 1452 : pp = 139.70162963867188
60 / 1452 : pp = 138.73023986816406
70 / 1452 : pp = 139.2675018310547
80 / 1452 : pp = 138.47824096679688
90 / 1452 : pp = 137.40432739257812
100 / 1452 : pp = 136.47793579101562
110 / 1452 : pp = 135.2294464111328
120 / 1452 : pp = 134.80728149414062
130 / 1452 : pp = 133.89822387695312
140 / 1452 : pp = 132.54141235351562
150 / 1452 : pp = 132.10025024414062
160 / 1452 : pp = 132.21829223632812
170 / 1452 : pp = 131.8765106201172
180 / 1452 : pp = 131.37515258789062
190 / 1452 : pp = 131.31622314453125
200 / 1452 : pp = 131.78297424316406
210 / 1452 : pp = 131.5507354736328
220 / 1452 : pp = 131.7002410888672
230 / 1452 : pp = 131.9277801513672
240 / 1452 : pp = 131.72166442871094
250 / 1452 : pp = 131.225830078125
260 / 1452 : pp = 130.7496337890625
270 / 1452 : pp = 129.9896697998047
280 / 1452 : pp = 130.10594177246094
290 / 1452 : pp = 130.41644287109375
300 / 1452 : pp = 130.5982208251953
310 / 1452 : pp = 130.36329650878906
320 / 1452 : pp = 130.5633544921875
330 / 1452 : pp = 130.77252197265625
340 / 1452 : pp = 130.273193359375
350 / 1452 : pp = 130.47889709472656
360 / 1452 : pp = 130.4348602294922
370 / 1452 : pp = 130.28126525878906
380 / 1452 : pp = 130.02786254882812
390 / 1452 : pp = 130.1564483642578
400 / 1452 : pp = 129.98440551757812
410 / 1452 : pp = 130.37721252441406
420 / 1452 : pp = 130.71859741210938
430 / 1452 : pp = 130.65939331054688
440 / 1452 : pp = 130.72987365722656
450 / 1452 : pp = 130.56272888183594
460 / 1452 : pp = 130.28195190429688
470 / 1452 : pp = 129.90936279296875
480 / 1452 : pp = 129.42857360839844
490 / 1452 : pp = 129.18077087402344
500 / 1452 : pp = 128.7588348388672
510 / 1452 : pp = 128.6303253173828
520 / 1452 : pp = 128.47616577148438
530 / 1452 : pp = 128.21148681640625
540 / 1452 : pp = 127.7218017578125
550 / 1452 : pp = 127.50067138671875
560 / 1452 : pp = 127.27574157714844
570 / 1452 : pp = 127.05399322509766
580 / 1452 : pp = 126.73983001708984
590 / 1452 : pp = 126.43692779541016
600 / 1452 : pp = 126.06050109863281
610 / 1452 : pp = 125.82952880859375
620 / 1452 : pp = 125.66295623779297
630 / 1452 : pp = 125.39354705810547
640 / 1452 : pp = 125.49463653564453
650 / 1452 : pp = 125.48816680908203
660 / 1452 : pp = 125.58712005615234
670 / 1452 : pp = 125.65978240966797
680 / 1452 : pp = 125.71456146240234
690 / 1452 : pp = 125.66937255859375
700 / 1452 : pp = 125.65900421142578
710 / 1452 : pp = 125.7271499633789
720 / 1452 : pp = 125.77758026123047
730 / 1452 : pp = 125.74129486083984
740 / 1452 : pp = 125.8759765625
750 / 1452 : pp = 125.91793823242188
760 / 1452 : pp = 125.99595642089844
770 / 1452 : pp = 126.18113708496094
780 / 1452 : pp = 126.35147094726562
790 / 1452 : pp = 126.50797271728516
800 / 1452 : pp = 126.49759674072266
810 / 1452 : pp = 126.48113250732422
820 / 1452 : pp = 126.52528381347656
830 / 1452 : pp = 126.705810546875
840 / 1452 : pp = 126.67517852783203
850 / 1452 : pp = 126.74176025390625
860 / 1452 : pp = 126.74151611328125
870 / 1452 : pp = 126.73414611816406
880 / 1452 : pp = 126.83026885986328
890 / 1452 : pp = 126.88519287109375
900 / 1452 : pp = 126.88053894042969
910 / 1452 : pp = 126.97138214111328
920 / 1452 : pp = 127.04660034179688
930 / 1452 : pp = 127.03763580322266
940 / 1452 : pp = 127.1126480102539
950 / 1452 : pp = 127.09610748291016
960 / 1452 : pp = 127.0873794555664
970 / 1452 : pp = 127.10343933105469
980 / 1452 : pp = 126.96441650390625
990 / 1452 : pp = 126.88519287109375
1000 / 1452 : pp = 126.7336654663086
1010 / 1452 : pp = 126.77796936035156
1020 / 1452 : pp = 126.89826202392578
1030 / 1452 : pp = 126.88761138916016
1040 / 1452 : pp = 126.95309448242188
1050 / 1452 : pp = 126.96478271484375
1060 / 1452 : pp = 126.89324188232422
1070 / 1452 : pp = 127.03242492675781
1080 / 1452 : pp = 127.13228607177734
1090 / 1452 : pp = 127.173095703125
1100 / 1452 : pp = 127.15975189208984
1110 / 1452 : pp = 127.0392074584961
1120 / 1452 : pp = 126.94032287597656
1130 / 1452 : pp = 126.80693054199219
1140 / 1452 : pp = 126.81315612792969
1150 / 1452 : pp = 126.90467834472656
1160 / 1452 : pp = 126.91236114501953
1170 / 1452 : pp = 126.90897369384766
1180 / 1452 : pp = 126.98052215576172
1190 / 1452 : pp = 127.07483673095703
1200 / 1452 : pp = 127.10216522216797
1210 / 1452 : pp = 127.08258819580078
1220 / 1452 : pp = 127.22943878173828
1230 / 1452 : pp = 127.38563537597656
1240 / 1452 : pp = 127.40538024902344
1250 / 1452 : pp = 127.53369140625
1260 / 1452 : pp = 127.59293365478516
1270 / 1452 : pp = 127.61489868164062
1280 / 1452 : pp = 127.6484375
1290 / 1452 : pp = 127.65257263183594
1300 / 1452 : pp = 127.69329833984375
1310 / 1452 : pp = 127.74549102783203
1320 / 1452 : pp = 127.7043228149414
1330 / 1452 : pp = 127.6866683959961
1340 / 1452 : pp = 127.70913696289062
1350 / 1452 : pp = 127.73233795166016
1360 / 1452 : pp = 127.7855224609375
1370 / 1452 : pp = 127.71918487548828
1380 / 1452 : pp = 127.69987487792969
1390 / 1452 : pp = 127.6697998046875
1400 / 1452 : pp = 127.61137390136719
1410 / 1452 : pp = 127.6404037475586
1420 / 1452 : pp = 127.61094665527344
1430 / 1452 : pp = 127.58216857910156
1440 / 1452 : pp = 127.61477661132812
1450 / 1452 : pp = 127.61964416503906
0 / 115 : pp = 228.21578979492188
10 / 115 : pp = 208.11244201660156
20 / 115 : pp = 210.688232421875
30 / 115 : pp = 207.62408447265625
40 / 115 : pp = 206.45184326171875
50 / 115 : pp = 201.52760314941406
60 / 115 : pp = 200.7784881591797
70 / 115 : pp = 196.83067321777344
80 / 115 : pp = 194.6357879638672
90 / 115 : pp = 191.9783935546875
100 / 115 : pp = 186.8787841796875
110 / 115 : pp = 184.35252380371094
Training perplexity: 127.60413360595703
Validation perplexity:183.8877410888672
Total time : 41.6636528968811
Epoch 15
0 / 1452 : pp = 156.81654357910156
10 / 1452 : pp = 142.1070556640625
20 / 1452 : pp = 139.55076599121094
30 / 1452 : pp = 136.63551330566406
40 / 1452 : pp = 138.5840606689453
50 / 1452 : pp = 136.052734375
60 / 1452 : pp = 134.93019104003906
70 / 1452 : pp = 135.65206909179688
80 / 1452 : pp = 135.2620086669922
90 / 1452 : pp = 134.314697265625
100 / 1452 : pp = 133.4916229248047
110 / 1452 : pp = 132.26052856445312
120 / 1452 : pp = 131.7714080810547
130 / 1452 : pp = 130.77365112304688
140 / 1452 : pp = 129.5411834716797
150 / 1452 : pp = 129.0791778564453
160 / 1452 : pp = 129.21920776367188
170 / 1452 : pp = 128.7528839111328
180 / 1452 : pp = 128.22279357910156
190 / 1452 : pp = 128.18177795410156
200 / 1452 : pp = 128.58758544921875
210 / 1452 : pp = 128.3906707763672
220 / 1452 : pp = 128.5266571044922
230 / 1452 : pp = 128.80563354492188
240 / 1452 : pp = 128.61886596679688
250 / 1452 : pp = 128.13172912597656
260 / 1452 : pp = 127.69220733642578
270 / 1452 : pp = 126.96150970458984
280 / 1452 : pp = 127.04702758789062
290 / 1452 : pp = 127.33565521240234
300 / 1452 : pp = 127.55929565429688
310 / 1452 : pp = 127.38514709472656
320 / 1452 : pp = 127.52171325683594
330 / 1452 : pp = 127.68690490722656
340 / 1452 : pp = 127.18340301513672
350 / 1452 : pp = 127.4073257446289
360 / 1452 : pp = 127.30432891845703
370 / 1452 : pp = 127.17618560791016
380 / 1452 : pp = 126.92579650878906
390 / 1452 : pp = 127.02473449707031
400 / 1452 : pp = 126.8515625
410 / 1452 : pp = 127.211669921875
420 / 1452 : pp = 127.51788330078125
430 / 1452 : pp = 127.47386169433594
440 / 1452 : pp = 127.57164001464844
450 / 1452 : pp = 127.3601303100586
460 / 1452 : pp = 127.09434509277344
470 / 1452 : pp = 126.71922302246094
480 / 1452 : pp = 126.24349212646484
490 / 1452 : pp = 125.98778533935547
500 / 1452 : pp = 125.59526824951172
510 / 1452 : pp = 125.4450912475586
520 / 1452 : pp = 125.29247283935547
530 / 1452 : pp = 125.03536224365234
540 / 1452 : pp = 124.5813980102539
550 / 1452 : pp = 124.33724212646484
560 / 1452 : pp = 124.08995819091797
570 / 1452 : pp = 123.86637878417969
580 / 1452 : pp = 123.53152465820312
590 / 1452 : pp = 123.20321655273438
600 / 1452 : pp = 122.85673522949219
610 / 1452 : pp = 122.64250946044922
620 / 1452 : pp = 122.4958724975586
630 / 1452 : pp = 122.22386169433594
640 / 1452 : pp = 122.31143188476562
650 / 1452 : pp = 122.30093383789062
660 / 1452 : pp = 122.39427947998047
670 / 1452 : pp = 122.45440673828125
680 / 1452 : pp = 122.51146697998047
690 / 1452 : pp = 122.4854736328125
700 / 1452 : pp = 122.48600006103516
710 / 1452 : pp = 122.56084442138672
720 / 1452 : pp = 122.59059143066406
730 / 1452 : pp = 122.55529022216797
740 / 1452 : pp = 122.69409942626953
750 / 1452 : pp = 122.76456451416016
760 / 1452 : pp = 122.84437561035156
770 / 1452 : pp = 123.02527618408203
780 / 1452 : pp = 123.20509338378906
790 / 1452 : pp = 123.36305236816406
800 / 1452 : pp = 123.36852264404297
810 / 1452 : pp = 123.36799621582031
820 / 1452 : pp = 123.39976501464844
830 / 1452 : pp = 123.59362030029297
840 / 1452 : pp = 123.56946563720703
850 / 1452 : pp = 123.63800811767578
860 / 1452 : pp = 123.63983917236328
870 / 1452 : pp = 123.64148712158203
880 / 1452 : pp = 123.7568588256836
890 / 1452 : pp = 123.7885513305664
900 / 1452 : pp = 123.79640197753906
910 / 1452 : pp = 123.86153411865234
920 / 1452 : pp = 123.92941284179688
930 / 1452 : pp = 123.9125747680664
940 / 1452 : pp = 123.95559692382812
950 / 1452 : pp = 123.93928527832031
960 / 1452 : pp = 123.94294738769531
970 / 1452 : pp = 123.95547485351562
980 / 1452 : pp = 123.8229751586914
990 / 1452 : pp = 123.73727416992188
1000 / 1452 : pp = 123.59091186523438
1010 / 1452 : pp = 123.634765625
1020 / 1452 : pp = 123.76506042480469
1030 / 1452 : pp = 123.75485229492188
1040 / 1452 : pp = 123.807861328125
1050 / 1452 : pp = 123.79156494140625
1060 / 1452 : pp = 123.73054504394531
1070 / 1452 : pp = 123.8615951538086
1080 / 1452 : pp = 123.96564483642578
1090 / 1452 : pp = 124.02104187011719
1100 / 1452 : pp = 124.012939453125
1110 / 1452 : pp = 123.87582397460938
1120 / 1452 : pp = 123.775390625
1130 / 1452 : pp = 123.63182067871094
1140 / 1452 : pp = 123.62391662597656
1150 / 1452 : pp = 123.71013641357422
1160 / 1452 : pp = 123.72423553466797
1170 / 1452 : pp = 123.71726989746094
1180 / 1452 : pp = 123.79032897949219
1190 / 1452 : pp = 123.87883758544922
1200 / 1452 : pp = 123.9125747680664
1210 / 1452 : pp = 123.90140533447266
1220 / 1452 : pp = 124.03245544433594
1230 / 1452 : pp = 124.19799041748047
1240 / 1452 : pp = 124.21469116210938
1250 / 1452 : pp = 124.34103393554688
1260 / 1452 : pp = 124.4041976928711
1270 / 1452 : pp = 124.42852020263672
1280 / 1452 : pp = 124.46656036376953
1290 / 1452 : pp = 124.4811019897461
1300 / 1452 : pp = 124.52384185791016
1310 / 1452 : pp = 124.57533264160156
1320 / 1452 : pp = 124.5398178100586
1330 / 1452 : pp = 124.52598571777344
1340 / 1452 : pp = 124.53311157226562
1350 / 1452 : pp = 124.57759094238281
1360 / 1452 : pp = 124.63385772705078
1370 / 1452 : pp = 124.58133697509766
1380 / 1452 : pp = 124.55769348144531
1390 / 1452 : pp = 124.54011535644531
1400 / 1452 : pp = 124.4884033203125
1410 / 1452 : pp = 124.51226806640625
1420 / 1452 : pp = 124.49683380126953
1430 / 1452 : pp = 124.4754638671875
1440 / 1452 : pp = 124.50164031982422
1450 / 1452 : pp = 124.50894165039062
0 / 115 : pp = 230.8488006591797
10 / 115 : pp = 209.2509002685547
20 / 115 : pp = 211.68577575683594
30 / 115 : pp = 208.44056701660156
40 / 115 : pp = 207.2039337158203
50 / 115 : pp = 202.1859588623047
60 / 115 : pp = 201.34739685058594
70 / 115 : pp = 197.4251251220703
80 / 115 : pp = 195.2623291015625
90 / 115 : pp = 192.592529296875
100 / 115 : pp = 187.39553833007812
110 / 115 : pp = 184.791259765625
Training perplexity: 124.4933853149414
Validation perplexity:184.32510375976562
Total time : 40.856229066848755
0 / 128 : pp = 184.6475067138672
10 / 128 : pp = 176.8856964111328
20 / 128 : pp = 164.3444366455078
30 / 128 : pp = 167.85472106933594
40 / 128 : pp = 169.25367736816406
50 / 128 : pp = 168.86561584472656
60 / 128 : pp = 168.11801147460938
70 / 128 : pp = 165.4105224609375
80 / 128 : pp = 162.91146850585938
90 / 128 : pp = 161.29742431640625
100 / 128 : pp = 162.45989990234375
110 / 128 : pp = 162.6834716796875
120 / 128 : pp = 164.3359832763672
=-==-==-==-==-=
Test perplexity: 164.0149383544922
=-==-==-==-==-=
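To help read the log above, here is a minimal sketch of how such numbers can be produced and compared. It assumes that each `pp` value is the running perplexity, i.e. the exponential of the mean cross-entropy loss over the batches seen so far, which matches the perplexity definition given earlier but is not confirmed by the log itself. The function names `running_perplexity` and `best_epoch` are hypothetical and do not come from the original code. The validation values are copied from the log, and the first summary block in this excerpt is taken to be epoch 11 because it sits right before "Epoch 12".

```python
import math


def running_perplexity(losses):
    """Perplexity over the batches seen so far: exp of the mean cross-entropy.

    Assumes each loss is an average cross-entropy in nats (natural log).
    """
    if not losses:
        raise ValueError("need at least one batch loss")
    return math.exp(sum(losses) / len(losses))


def best_epoch(valid_pp):
    """Return the (epoch, perplexity) pair with the lowest validation perplexity."""
    return min(valid_pp.items(), key=lambda item: item[1])


# Validation perplexities copied from the log above.
# Epoch 11 is inferred: it is the block printed just before "Epoch 12".
valid_pp = {
    11: 184.18101501464844,
    12: 183.8900146484375,
    13: 183.88186645507812,
    14: 183.8877410888672,
    15: 184.32510375976562,
}

# Hypothetical per-batch losses, for illustration only.
example_losses = [5.2, 5.0, 4.9]
print("running perplexity:", running_perplexity(example_losses))
print("best validation epoch:", best_epoch(valid_pp))
```

Read this way, the training perplexity keeps falling (about 138.9 to 124.5) while the validation perplexity stays flat near 183.9 and then rises slightly in epoch 15, which suggests the model has started to overfit. The test perplexity of about 164.0 is reported after training ends.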
For more details, see the repository linked below:
https://github.com/weizhenzhao/cs224d_nlp_problem_set2