Open-domain QA

Overview

The whole system consists of a Document Retriever and a Document Reader. Given any question, the Document Retriever returns the top five Wikipedia articles, and the Document Reader then processes these articles to extract the answer.

Document Retriever

The Retriever compares the TF-IDF weighted bag-of-words vectors of the articles and the question. Taking local word order into account with n-gram features improves performance further; in the paper, bigram counts performed best. The bigrams are mapped to \(2^{24}\) bins with an unsigned murmur3 hash, following the feature hashing of (Weinberger et al., 2009), to preserve speed and memory efficiency.
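
The retrieval step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: `hash_bin`, `hashed_counts` and `rank` are hypothetical names, md5 stands in for the murmur3 hash because it ships with Python, unigrams are kept alongside bigrams, and the smoothed IDF formula is one common choice rather than the exact weighting used in the paper.

```python
import hashlib
import math
from collections import Counter
from typing import List

NUM_BINS = 2 ** 24  # number of hash bins given in the text


def hash_bin(ngram: str) -> int:
    # Stand-in for the unsigned murmur3 hash (assumption: md5 is used
    # here only because it is available in the standard library).
    digest = hashlib.md5(ngram.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "little") % NUM_BINS


def hashed_counts(text: str) -> Counter:
    # Count unigrams and bigrams after mapping each one to a bin.
    tokens = text.lower().split()
    grams = tokens + [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return Counter(hash_bin(g) for g in grams)


def rank(question: str, articles: List[str], k: int = 5) -> List[int]:
    # Return the indices of the k articles with the highest TF-IDF
    # dot product against the question.
    doc_vecs = [hashed_counts(a) for a in articles]
    n = len(articles)
    df = Counter(b for vec in doc_vecs for b in vec)
    # Smoothed IDF (assumption; the paper's weighting may differ).
    idf = {b: math.log((n + 1) / (c + 1)) + 1.0 for b, c in df.items()}
    q_vec = hashed_counts(question)
    scores = []
    for vec in doc_vecs:
        score = sum(q_tf * idf[b] * vec[b] * idf[b]
                    for b, q_tf in q_vec.items() if b in vec)
        scores.append(score)
    return sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
```

Calling `rank(question, articles)` with the default `k=5` mirrors the "top five articles" setting described in the overview. Hashing lets the vocabulary of bigrams stay implicit, at the cost of occasional collisions between unrelated n-grams.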

Document Reader

The Document Reader consists of a multi-layer BiLSTM and an RNN layer. The multi-layer BiLSTM encodes the paragraph tokens, and a separate recurrent network encodes the question.

The paragraph encoding comprises the following parts:

Word embeddings:
- 300d GloVe. Only the 1000 most frequent question words are fine-tuned, because the representations of some key words such as what, how, which, many could be crucial for QA systems.
Exact match:
- Three simple binary features, indicating whether \(p_i\) can be exactly matched to one question word in \(q\), either in its original, lowercase or lemma form. The ablation analysis shows that these features are helpful.
Token features:
- POS tag, NER tag, and (normalized) term frequency (TF)
Aligned question embedding:
- This embedding is actually an attention mechanism between the question and the paragraph. It is computed as follows, where \(\alpha(\cdot)\) is a single dense layer with ReLU nonlinearity:
  \[\begin{aligned}
  &a_{i,j} = \frac{\exp(\alpha(E(p_i))\cdot \alpha(E(q_j)))}{\sum_{j'}\exp(\alpha(E(p_i)) \cdot \alpha(E(q_{j'})))}\\
  &f_{align}(p_i) = \sum_j a_{i,j}E(q_j)
  \end{aligned}\]
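
The aligned question embedding above can be illustrated with a small pure-Python sketch. The function names and the `weights`/`bias` parameters of the dense layer are hypothetical; in a real model they would be learned, and the embeddings would be 300d GloVe vectors rather than toy lists.

```python
import math


def dense_relu(x, weights, bias):
    # alpha(.): one dense layer followed by a ReLU nonlinearity.
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def softmax(scores):
    # Numerically stable softmax over a list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aligned_question_embedding(p_emb, q_embs, weights, bias):
    # p_emb is E(p_i); q_embs is the list of E(q_j).
    # Returns f_align(p_i) and the attention weights a_{i,j}.
    p_proj = dense_relu(p_emb, weights, bias)
    scores = [sum(a * b for a, b in zip(p_proj, dense_relu(q, weights, bias)))
              for q in q_embs]
    attn = softmax(scores)
    dim = len(q_embs[0])
    f_align = [sum(a * q[d] for a, q in zip(attn, q_embs)) for d in range(dim)]
    return f_align, attn
```

Note that the attention scores are computed on the projected vectors \(\alpha(E(\cdot))\), while the weighted sum is taken over the raw question embeddings \(E(q_j)\), matching the two formulas above.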

Question encoding

A recurrent NN is applied on top of the word embeddings of \(q_i\), and the resulting hidden units are combined into one single vector: \(\{q_1, \cdots, q_l\} \rightarrow q\). The vector \(q\) is computed as follows:
\[
\begin{aligned}
& b_j = \frac{\exp(w \cdot q_j)}{\sum_{j'} \exp(w \cdot q_{j'})}\\
& q = \sum_j b_j q_j
\end{aligned}
\]
where \(b_j\) encodes the importance of each question word and \(w\) is a learned weight vector. I think this computation is very similar to self-attention over the question.
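
As a minimal sketch of this weighted pooling, assuming the hidden units \(q_j\) are already available as plain lists (the function name `attention_pool` is hypothetical):

```python
import math


def attention_pool(hidden, w):
    # hidden: list of hidden vectors q_j; w: weight vector of the same size.
    # Returns the pooled question vector q and the weights b_j.
    scores = [sum(wi * hi for wi, hi in zip(w, h)) for h in hidden]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    b = [e / total for e in exps]
    dim = len(hidden[0])
    q = [sum(bj * h[d] for bj, h in zip(b, hidden)) for d in range(dim)]
    return q, b
```

With a zero weight vector every question word gets the same weight, so the pooling reduces to a plain average of the hidden units.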

Prediction

The paragraph vectors \(p_i\) and the question vector \(q\) are taken as input to train classifiers that predict the start and end positions of the correct span:
\[\begin{aligned}
P_{start}(i) & \propto \exp(p_iW_sq)\\
P_{end}(i) & \propto \exp(p_iW_eq)
\end{aligned}
\]
Then the best span from token \(i\) to token \(i'\) is selected such that \(i \leq i' \leq i+15\) and \(P_{start}(i) \times P_{end}(i')\) is maximized.
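
The span selection rule can be sketched as a brute-force search over all allowed (start, end) pairs. The function name `best_span` is hypothetical, and `p_start` and `p_end` are assumed to be per-token probabilities that have already been computed.

```python
def best_span(p_start, p_end, max_len=15):
    # Search all pairs (i, j) with i <= j <= i + max_len and keep the
    # one that maximizes p_start[i] * p_end[j].
    best = (0, 0)
    best_score = -1.0
    n = len(p_end)
    for i, ps in enumerate(p_start):
        for j in range(i, min(i + max_len + 1, n)):
            score = ps * p_end[j]
            if score > best_score:
                best_score = score
                best = (i, j)
    return best, best_score
```

The constraint \(i \leq i'\) matters: a token with a high end probability cannot be paired with a start that comes after it, so the best span is not always the pair of individually most likely positions.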

Analysis

The ablation analysis shows that the aligned feature and the exact_match feature play similar, complementary roles: removing either one alone hardly affects performance, but performance drops dramatically when both are removed.
