CS224d 单隐层全连接网络处理英文命名实体识别tensorflow

什么是NER？

命名实体识别（NER）是指识别文本中具有特定意义的实体，主要包括人名、地名、机构名、专有名词等。命名实体识别是信息提取、问答系统、句法分析、机器翻译等应用领域的重要基础工具，作为结构化信息提取的重要步骤。

NER具体任务

1.确定实体位置 2.确定实体类别

给一个单词，我们需要根据上下文判断，它属于下面四类的哪一个，如果都不属于，则类别为0，即不是实体，所以这是一个需要分成 5 类的问题：

• Person (PER)

• Organization (ORG)

• Location (LOC)

• Miscellaneous (MISC)

训练数据有两列，第一列是单词，第二列是标签。

EU    ORG

rejects    O

German    MISC

Peter    PER

BRUSSELS    LOC

2.模型：

输入层的 x^(t) 为以 x_t 为中心的窗口大小为3的上下文语境，x_t 是 one-hot 向量，x_t 与 L 作用后就是相应的词向量，词向量的长度为 d = 50 ：

建立一个只有一个隐藏层的神经网络，隐藏层维度是 100，y^ 就是得到的预测值，维度是 5：

用交叉熵来计算误差：

loss(J)对各个参数进行求导：

链式法则

在 TensorFlow 中求导是自动实现的，这里用Adam优化算法更新梯度，不断地迭代，使得loss越来越小直至收敛。

3.具体实现：

在 def test_NER() 中，我们进行 max_epochs 次迭代，每次，用 training data 训练模型得到一对 train_loss, train_acc，再用这个模型去预测 validation data，得到一对 val_loss, predictions，我们选择最小的 val_loss，并把相应的参数 weights 保存起来，最后我们是要用这些参数去预测 test data 的类别标签：

def test_NER():

  config = Config()

  with tf.Graph().as_default():

    model = NERModel(config)   

    init = tf.initialize_all_variables()

    saver = tf.train.Saver()

    with tf.Session() as session:

    # 最好的值时，它的 loss 它的 迭代次数 epoch

      best_val_loss = float('inf')

      best_val_epoch = 0

      session.run(init)

      for epoch in xrange(config.max_epochs):

        print 'Epoch {}'.format(epoch)

        start = time.time()

        ###

        train_loss, train_acc = model.run_epoch(session, model.X_train,

                                                model.y_train)

        # 2.用这个model去预测 dev 数据，得到loss 和 prediction

        val_loss, predictions = model.predict(session, model.X_dev, model.y_dev)

        print 'Training loss: {}'.format(train_loss)

        print 'Training acc: {}'.format(train_acc)

        print 'Validation loss: {}'.format(val_loss)

        if val_loss < best_val_loss:

          best_val_loss = val_loss

          best_val_epoch = epoch

          if not os.path.exists("./weights"):

            os.makedirs("./weights")

          saver.save(session, './weights/ner.weights')

        if epoch - best_val_epoch > config.early_stopping:

          break

        ###

        # 把 dev 的lable数据放进去，计算prediction的confusion

        confusion = calculate_confusion(config, predictions, model.y_dev)

        print_confusion(confusion, model.num_to_tag)

        print 'Total time: {}'.format(time.time() - start)

      # 再次加载保存过的 weights，用 test 数据做预测，得到预测结果

      saver.restore(session, './weights/ner.weights')

      print 'Test'

      print '=-=-='

      print 'Writing predictions to q2_test.predicted'

      _, predictions = model.predict(session, model.X_test, model.y_test)

      save_predictions(predictions, "q2_test.predicted")   

if __name__ == "__main__":

  test_NER()

4.模型训练过程：

首先导入数据 training，validation，test：

# Load the training set

docs = du.load_dataset('data/ner/train')

# Load the dev set (for tuning hyperparameters)

docs = du.load_dataset('data/ner/dev')

# Load the test set (dummy labels only)

docs = du.load_dataset('data/ner/test.masked')

把单词转化成 one-hot 向量后，再转化成词向量：

def add_embedding(self):

    # The embedding lookup is currently only implemented for the CPU

    with tf.device('/cpu:0'):

      embedding = tf.get_variable('Embedding', [len(self.wv), self.config.embed_size])

      # lookup window大小的context的word embedding

      window = tf.nn.embedding_lookup(embedding, self.input_placeholder)

      window = tf.reshape(

        window, [-1, self.config.window_size * self.config.embed_size])

      return window

建立神经层，包括用 xavier 去初始化第一层， L2 正则化和用 dropout 来减小过拟合的处理：

def add_model(self, window):

    with tf.variable_scope('Layer1', initializer=xavier_weight_init()) as scope:

      W = tf.get_variable(

          'W', [self.config.window_size * self.config.embed_size,

                self.config.hidden_size])

      b1 = tf.get_variable('b1', [self.config.hidden_size])

      h = tf.nn.tanh(tf.matmul(window, W) + b1)

      if self.config.l2:

          tf.add_to_collection('total_loss', 0.5 * self.config.l2 * tf.nn.l2_loss(W))    

    with tf.variable_scope('Layer2', initializer=xavier_weight_init()) as scope:

      U = tf.get_variable('U', [self.config.hidden_size, self.config.label_size])

      b2 = tf.get_variable('b2', [self.config.label_size])

      y = tf.matmul(h, U) + b2

      if self.config.l2:

          tf.add_to_collection('total_loss', 0.5 * self.config.l2 * tf.nn.l2_loss(U))

    output = tf.nn.dropout(y, self.dropout_placeholder)

    return output

用 cross entropy 来计算 loss：

def add_loss_op(self, y):

    cross_entropy = tf.reduce_mean(

        tf.nn.softmax_cross_entropy_with_logits(y, self.labels_placeholder))

    tf.add_to_collection('total_loss', cross_entropy)

    loss = tf.add_n(tf.get_collection('total_loss'))        

    return loss

接着用 Adam Optimizer 把loss最小化：

 def add_training_op(self, loss):

    optimizer = tf.train.AdamOptimizer(self.config.lr)

    global_step = tf.Variable(0, name='global_step', trainable=False)

    train_op = optimizer.minimize(loss, global_step=global_step)

    return train_op

每一次训练后，得到了最小化 loss 相应的 weights。

完整程序见：code

CS224d 单隐层全连接网络处理英文命名实体识别tensorflow的更多相关文章

基于BERT预训练的中文命名实体识别TensorFlow实现
BERT-BiLSMT-CRF-NERTensorflow solution of NER task Using BiLSTM-CRF model with Google BERT Fine-tuni ...
NLP入门（八）使用CRF++实现命名实体识别(NER)
CRF与NER简介 CRF,英文全称为conditional random field, 中文名为条件随机场,是给定一组输入随机变量条件下另一组输出随机变量的条件概率分布模型,其特点是假设输出随机 ...
实现一个单隐层神经网络python
看过首席科学家NG的深度学习公开课很久了,一直没有时间做课后编程题,做完想把思路总结下来,仅仅记录编程主线. 一引用工具包 import numpy as np import matplotlib. ...
cs224d 作业 problem set2 (二) TensorFlow 实现命名实体识别
神经网络在命名实体识别中的应用所有的这些包括之前的两篇都可以通过tensorflow 模型的托管部署到 google cloud 上面,发布成restful接口,从而与任何的ERP,CRM系统集成. ...
HMM（隐马尔科夫模型）与分词、词性标注、命名实体识别
转载自 http://www.cnblogs.com/skyme/p/4651331.html HMM(隐马尔可夫模型)是用来描述隐含未知参数的统计模型,举一个经典的例子:一个东京的朋友每天根据天气{ ...
基于bert的命名实体识别，pytorch实现，支持中文/英文【源学计划】
声明:为了帮助初学者快速入门和上手,开始源学计划,即通过源代码进行学习.该计划收取少量费用,提供有质量保证的源码,以及详细的使用说明. 第一个项目是基于bert的命名实体识别(name entity ...
DL4NLP —— 序列标注：BiLSTM-CRF模型做基于字的中文命名实体识别
三个月之前 NLP 课程结课,我们做的是命名实体识别的实验.在MSRA的简体中文NER语料(我是从这里下载的,非官方出品,可能不是SIGHAN 2006 Bakeoff-3评测所使用的原版语料)上训练 ...
神经网络结构在命名实体识别（NER）中的应用
神经网络结构在命名实体识别(NER)中的应用近年来,基于神经网络的深度学习方法在自然语言处理领域已经取得了不少进展.作为NLP领域的基础任务-命名实体识别(Named Entity Recognit ...
2. 知识图谱-命名实体识别（NER）详解
1. 通俗易懂解释知识图谱(Knowledge Graph) 2. 知识图谱-命名实体识别(NER)详解 3. 哈工大LTP解析 1. 前言在解了知识图谱的全貌之后,我们现在慢慢的开始深入的学习知识 ...

随机推荐

POJ 2127 Greatest Common Increasing Subsequence
You are given two sequences of integer numbers. Write a program to determine their common increasing ...
Qt之QEvent（所有事件的翻译）
QEvent 类是所有事件类的基类,事件对象包含事件参数. Qt 的主事件循环(QCoreApplication::exec())从事件队列中获取本地窗口系统事件,将它们转化为 QEvents,然后将 ...
resolution will not be reattempted until the update interval of repository-group has elapsed or updates are forced
Failed to execute goal on project safetan-web: Could not resolve dependencies for project com.safeta ...
JavaScript学习 - 基础(六) - DOM基础操作
DOM: DOM定义了访问HTML 和XML 文档的标准:1.核心DOM 针对结构化文档的标准模型2.XMK DOM 针对XML文档的标准模型3.HTML DOM 针对HTML文档的标准模型 DOM节 ...
python - 迭代器(迭代协议/可迭代对象)
迭代器 # 迭代器协议 # 迭代协议:对象必须提供一个next方法,执行该方法要么返回迭代中的下一项,要么就触发一个 StopIteration 异常,以终止迭代(只能往后走不能往前退) # 可迭代对 ...
js常用的工具函数
JS选取DOM元素的方法注意:原生JS选取DOM元素比使用jQuery类库选取要快很多1.通过ID选取元素document.getElementById('myid');2.通过CLASS选取元素do ...
Bootstrap模态框（一个页面显示多个）的使用
在一个页面显示多个模态框时要讲每个模态框用div包裹起来,否咋会产生格式错误. <html> <head> <meta charset="utf-8" ...
【vim】跳转到上/下一个修改的位置
当你编辑一个很大的文件时,经常要做的事是在某处进行修改,然后跳到另外一处.如果你想跳回之前修改的地方,使用命令: Ctrl+o 来回到之前修改的地方类似的: Ctrl+i 会回退上面的跳动.
【vim】缩写 :ab [缩写] [要替换的文字]
一个很可能是最令人印象深刻的窍门是你可以在 Vim 中定义缩写,它可以实时地把你输入的东西替换为另外的东西.语法格式如下: :ab [缩写] [要替换的文字] 一个通用的例子是: :ab asap a ...
David McCullough, Jr.为韦斯利高中毕业生演讲〈你并不特别〉
Dr. Wong, Dr. Keough, Mrs.Novogroski, Ms. Curran, members of the board of education, familyand frien ...

CS224d 单隐层全连接网络处理英文命名实体识别tensorflow

4.模型训练过程：

CS224d 单隐层全连接网络处理英文命名实体识别tensorflow的更多相关文章

随机推荐

热门专题