Open-Domain QA -paper
Open-domain QA
Overview
The whole system is consisted with Document Retriever and Document Reader. The Document Retriever returns top five Wikipedia articles given any question, then the Document Reader will process these articles.
Document Retriever
The Retriever compares the TF-IDF weighted bag of word vectors between the articles and questions. And if take the word order into account with n-gram features, the performence will be better. In the paper, useing bigram counts performed best. It used hashing of (Weinberger et al., 2009) to map the bigrams to \(2^{24}\) bins with an unsigned murmur3 hash to preserv speed and memory efficiency.
Document Reader
The Document Reader was consisted of a multi-layer BiLSTM and a RNN layer. The input first was processed by a RNN, and then a multi-layer BiLSTM.
Paragraph encoding was comprisied of the following pars:
- Word embeddings:
- 300d Glove, only fine-tune the 1000 most frequent question words because the representations of some keu words such as what, how, whick, many could be crucial for QA systems.
- Exact match:
- Three simple features, indicating whether \(p_i\) can be exactly matched to one question word in \(q\), either in its original, lowercase or lemma form. It's helpful as the ablation analysis.
- Token features:
- POS, NER TF
- Aligned question embedding:
- the embedding is actually an attention mechanism between question and paragraph. It was computed as following: (\(\alpha(\cdot)\) is a single dense layer with ReLU nonlinearity.)
\[\begin{aligned}
&a_{i,j} = \frac{exp(\alpha(E(p_i))\cdot \alpha(E(q_j)))}{\sum_{j^`}exp(\alpha(E(p_i)) \cdot \alpha(E(q_{j^`})))}\\
&f_{align(p_i)} = \sum_j a_{i,j}E(q_j)
\end{aligned}\]
- the embedding is actually an attention mechanism between question and paragraph. It was computed as following: (\(\alpha(\cdot)\) is a single dense layer with ReLU nonlinearity.)
Question encoding
Only apply a recurrent NN on top of word embedding of \(q_i\) and combine the resulting hidden units into one single vector: \(\{q_1, \cdots, q_l\} \rightarrow q\). The \(q\) was computed as following:
\[
\begin{aligned}
& b_j = \frac{exp(w \cdot q_j)}{\sum_{j^`} exp(w \cdot q_{j^`})}\\
& q = \sum_j b_jq_j
\end{aligned}
\]
where \(b_j\) encodes the importance of each question word. I think the computation is very similar with the question self attention.
Prediction
Take the \(p\) and \(q\) as input to train a classifier to predict the correct span positions.
\[\begin{aligned}
P_{start}(i) & \propto exp(p_iW_sq)\\
P_{end}(i) & \propto exp(p_iW_eq)
\end{aligned}
\]
Then select the best span from token \(i\) and token \(i^`\) such that \(i \leq i^` \leq i+15\) and \(P_{start}(i) \times P_{end}(i^`)\) is maximized.
Analysis
The ablation analysis result:

As the result showing, the aligned feature and exact_match feature are complementary and similar role as it does not matter when removing them respectively, but the performance drops dramatically wehn removing both of them.
Open-Domain QA -paper的更多相关文章
- EasyMesh - A Two-Dimensional Quality Mesh Generator
EasyMesh - A Two-Dimensional Quality Mesh Generator eryar@163.com Abstract. EasyMesh is developed by ...
- 深度学习课程笔记(十七)Meta-learning (Model Agnostic Meta Learning)
深度学习课程笔记(十七)Meta-learning (Model Agnostic Meta Learning) 2018-08-09 12:21:33 The video tutorial can ...
- (转) AdversarialNetsPapers
本文转自:https://github.com/zhangqianhui/AdversarialNetsPapers AdversarialNetsPapers The classical Pap ...
- Official Program for CVPR 2015
From: http://www.pamitc.org/cvpr15/program.php Official Program for CVPR 2015 Monday, June 8 8:30am ...
- 生成对抗网络资源 Adversarial Nets Papers
来源:https://github.com/zhangqianhui/AdversarialNetsPapers AdversarialNetsPapers The classical Papers ...
- RAC的QA
RAC: Frequently Asked Questions [ID 220970.1] 修改时间 13-JAN-2011 类型 FAQ 状态 PUBLISHED Appli ...
- How to implement an algorithm from a scientific paper
Author: Emmanuel Goossaert 翻译 This article is a short guide to implementing an algorithm from a scie ...
- paper 54 :图像频率的理解
我一直在思考一个问题,图像增强以后,哪些方面的特征最为显著,思来想去,无果而终!翻看了一篇知网的paper,基于保真度(VIF)的增强图像质量评价,文章中指出无参考质量评价,可以从三个方面考虑:平均梯 ...
- 如何写出优秀的研究论文 Chapter 1. How to Write an A+ Research Paper
This Chapter outlines the logical steps to writing a good research paper. To achieve supreme excelle ...
随机推荐
- MyEclipse打不开 报xxxxxx. log。
1.找到MyEcliPse安装目录下configuration文件夹 打开 2.删除org.eclipse.update这个文件夹 3.再打开org.eclipse.osgi/.manager 4.删 ...
- redhat7.4切换yum源为免费源
1.redhat是Linux系统中付费的企业版,虽然安装什么是免费的,但是需要注册. 如果你有注册码,暂请出门左拐(我没有注册码,所以我也不会注册,不用往下看了). Linux系统收费版:RedHat ...
- 提高MYSQL大数据量查询的速度
1.对查询进行优化,应尽量避免全表扫描,首先应考虑在 where 及 order by 涉及的列上建立索引. 2.应尽量避免在 where 子句中对字段进行 null 值判断,否则将导致引擎放弃使用索 ...
- Win2012 R2安装 mysql8.0
1.官网下载安装 官网上面写着x86,其实是兼容x64和x86的,下载安装就行 2.安装navicat 3.navicat连接mysql的时候出现错误 client does not support ...
- 使用window.performance分析web前端性能
参考链接:https://blog.csdn.net/lovenjoe/article/details/80260658
- DC综合简单总结(2)
DC综合简单总结(2) 建立时间和保持时间和数据输出延时时间 一.概念 建立时间和保持时间都是针对触发器的特性说的. 建立时间(Tsu:set up time) 是指在触发器的时钟信号上升沿到来以前, ...
- TextView设置不同字段不同点击事件
转载自:http://www.apkbus.com/blog-160625-59265.html package com.example.fortextdemo; import java.util ...
- 【easy】784. Letter Case Permutation
Examples: Input: S = "a1b2" Output: ["a1b2", "a1B2", "A1b2", ...
- 爬虫 requests模块的其他用法 抽屉网线程池回调爬取+保存实例,gihub登陆实例
requests模块的其他用法 #通常我们在发送请求时都需要带上请求头,请求头是将自身伪装成浏览器的关键,常见的有用的请求头如下 Host Referer #大型网站通常都会根据该参数判断请求的来源 ...
- nginx的location配置详解
语法规则: location [=|~|~*|^~] /uri/ { … } =开头表示精确匹配 ^~开头表示uri以某个常规字符串开头,理解为匹配url路径即可.nginx不对url做编码,因此请求 ...