Open-domain QA

Overview

The system consists of a Document Retriever and a Document Reader. Given any question, the Document Retriever returns the top five Wikipedia articles, and the Document Reader then processes these articles.

Document Retriever

The Retriever compares the TF-IDF weighted bag-of-words vectors of the articles and the question. Taking local word order into account with n-gram features improves performance further. In the paper, using bigram counts performed best. It used the feature hashing of (Weinberger et al., 2009) to map the bigrams to \(2^{24}\) bins with an unsigned murmur3 hash, which preserves speed and memory efficiency.
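The retrieval step above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the whitespace tokenizer, the `log(1 + tf)` and smoothed IDF weighting, and the function names `hashed_ngrams`, `build_index`, and `retrieve` are all assumptions, and `zlib.crc32` is used as a stand-in for the unsigned murmur3 hash because the Python standard library does not include murmur3.

```python
import math
import zlib
from collections import Counter

NUM_BINS = 2 ** 24  # number of hash bins, as stated in the text


def hashed_ngrams(text, n_max=2):
    """Count unigrams and bigrams of `text`, hashed into NUM_BINS bins."""
    tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            gram = " ".join(tokens[i:i + n])
            # crc32 is only a stand-in for the unsigned murmur3 hash
            counts[zlib.crc32(gram.encode("utf-8")) % NUM_BINS] += 1
    return counts


def build_index(docs):
    """Return per-document sparse TF-IDF vectors (dicts) and the IDF table."""
    doc_counts = [hashed_ngrams(d) for d in docs]
    df = Counter()
    for counts in doc_counts:
        df.update(counts.keys())  # document frequency of each bin
    n_docs = len(docs)
    # smoothed IDF; one common choice, assumed here for illustration
    idf = {b: math.log((1 + n_docs) / (1 + f)) + 1.0 for b, f in df.items()}
    vectors = [{b: math.log1p(c) * idf[b] for b, c in counts.items()}
               for counts in doc_counts]
    return vectors, idf


def retrieve(question, vectors, idf, k=5):
    """Rank documents by dot product with the question's TF-IDF vector."""
    q_counts = hashed_ngrams(question)
    q_vec = {b: math.log1p(c) * idf[b] for b, c in q_counts.items() if b in idf}
    scores = [sum(w * vec.get(b, 0.0) for b, w in q_vec.items())
              for vec in vectors]
    ranked = sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


docs = [
    "Paris is the capital of France",
    "Berlin is the capital of Germany",
    "The mitochondria is the powerhouse of the cell",
]
vectors, idf = build_index(docs)
print(retrieve("what is the capital of France", vectors, idf))
```

Hashing keeps the vocabulary implicit: no bigram dictionary has to be stored, at the cost of occasional collisions between unrelated n-grams.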

Document Reader

The Document Reader consists of a multi-layer BiLSTM, which encodes the paragraph, and another RNN layer, which encodes the question.

Paragraph encoding comprises the following parts:

  • Word embeddings:

    • 300d GloVe. Only the 1000 most frequent question words are fine-tuned, because the representations of some key words such as what, how, which, and many could be crucial for QA systems.
  • Exact match:
    • Three simple binary features, indicating whether \(p_i\) can be exactly matched to one question word in \(q\), either in its original, lowercase, or lemma form. The ablation analysis shows that these features are helpful.
  • Token features:
    • Part-of-speech (POS) tag, named entity recognition (NER) tag, and term frequency (TF).
  • Aligned question embedding:
    • This embedding is an attention mechanism between the question and the paragraph. It is computed as follows, where \(E(\cdot)\) is the word embedding and \(\alpha(\cdot)\) is a single dense layer with ReLU nonlinearity:
      \[\begin{aligned}
      &a_{i,j} = \frac{\exp(\alpha(E(p_i))\cdot \alpha(E(q_j)))}{\sum_{j'}\exp(\alpha(E(p_i)) \cdot \alpha(E(q_{j'})))}\\
      &f_{align}(p_i) = \sum_j a_{i,j}E(q_j)
      \end{aligned}\]
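The aligned question embedding can be sketched in a few lines of plain Python. This is only an illustration of the two formulas above under simplifying assumptions: embeddings are plain lists of floats, and the helper names `dense_relu`, `softmax`, and `aligned_question_embedding`, as well as the toy weights, are hypothetical.

```python
import math


def dense_relu(x, weights, bias):
    """alpha(.): a single dense layer followed by a ReLU nonlinearity."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aligned_question_embedding(p_emb, q_embs, weights, bias):
    """Return (f_align(p_i), attention weights a_i) for one paragraph token."""
    p_proj = dense_relu(p_emb, weights, bias)
    # dot product between projected paragraph token and each question word
    scores = [sum(a * b for a, b in zip(p_proj, dense_relu(q, weights, bias)))
              for q in q_embs]
    attn = softmax(scores)  # a_{i,j}, normalized over question words j
    dim = len(q_embs[0])
    # weighted sum of the *unprojected* question embeddings E(q_j)
    f_align = [sum(a * q[d] for a, q in zip(attn, q_embs)) for d in range(dim)]
    return f_align, attn


# Toy example: identity projection, two question words (assumed values)
weights = [[1.0, 0.0], [0.0, 1.0]]
bias = [0.0, 0.0]
f_align, attn = aligned_question_embedding(
    [1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], weights, bias)
print(attn, f_align)
```

Unlike the exact match features, this soft alignment can relate similar but non-identical words, because the score depends on embedding similarity rather than string equality.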

Question encoding

A single recurrent NN is applied on top of the word embeddings of \(q_i\), and the resulting hidden units are combined into one single vector: \(\{q_1, \cdots, q_l\} \rightarrow q\). The vector \(q\) is computed as follows:
\[
\begin{aligned}
& b_j = \frac{\exp(w \cdot q_j)}{\sum_{j'} \exp(w \cdot q_{j'})}\\
& q = \sum_j b_jq_j
\end{aligned}
\]
where \(b_j\) encodes the importance of each question word and \(w\) is a weight vector to learn. I think this computation is very similar to self-attention over the question.
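The weighted sum above can be written out as a short sketch. It assumes the RNN hidden states \(q_j\) are already available as lists of floats; the function name `encode_question` and the toy values are hypothetical.

```python
import math


def encode_question(hidden_states, w):
    """Combine hidden states q_j into one vector q with weights b_j."""
    # unnormalized importance score w . q_j for each question word
    scores = [sum(wi * hi for wi, hi in zip(w, h)) for h in hidden_states]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    b = [e / total for e in exps]  # b_j, sums to 1
    dim = len(hidden_states[0])
    q = [sum(bj * h[d] for bj, h in zip(b, hidden_states)) for d in range(dim)]
    return q, b


# With w = 0 every word gets the same weight, so q is the plain average.
q, b = encode_question([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0])
print(q, b)
```

A learned \(w\) lets the model put more weight on informative question words than a plain average would.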

Prediction

The paragraph vectors \(p_i\) and the question vector \(q\) are taken as input to train a classifier that predicts the start and end positions of the correct span:
\[\begin{aligned}
P_{start}(i) & \propto \exp(p_iW_sq)\\
P_{end}(i) & \propto \exp(p_iW_eq)
\end{aligned}
\]
At prediction time, the best span from token \(i\) to token \(i'\) is selected such that \(i \leq i' \leq i+15\) and \(P_{start}(i) \times P_{end}(i')\) is maximized.
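The span search can be sketched as a brute-force loop over all valid pairs. The function name `best_span` and the toy probabilities are assumptions for illustration; the start and end probabilities are taken as already computed.

```python
def best_span(p_start, p_end, max_len=15):
    """Return (i, i_end, score) maximizing p_start[i] * p_end[i_end]
    subject to i <= i_end <= i + max_len."""
    n = len(p_start)
    best = (0, 0)
    best_score = -1.0
    for i in range(n):
        # the end token may lie at most max_len positions after the start
        for j in range(i, min(i + max_len, n - 1) + 1):
            score = p_start[i] * p_end[j]
            if score > best_score:
                best_score = score
                best = (i, j)
    return best[0], best[1], best_score


# Toy probabilities (assumed values): the unconstrained maximum would pair
# start=1 with end=0, which is not a valid span, so the search skips it.
p_start = [0.1, 0.6, 0.2, 0.1]
p_end = [0.5, 0.1, 0.3, 0.1]
print(best_span(p_start, p_end))
```

The loop costs on the order of \(15n\) multiplications for a paragraph of \(n\) tokens, so the length limit also keeps the search cheap.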

Analysis

The ablation analysis shows that the aligned question embedding and the exact_match feature play similar, complementary roles: removing either one alone makes little difference, but the performance drops dramatically when both are removed.
