100 Must-Read NLP Papers

A paper collection I compiled myself; updated.
Link: https://pan.baidu.com/s/16k2s2HYfrKHLBS5lxZIkuw
Extraction code: x7tn

This is a list of 100 important natural language processing (NLP) papers that serious students and researchers working in the field should probably know about and read.

This list is compiled by Masato Hagiwara.

本榜单由Masato Hagiwara编制。

I welcome any feedback on this list. This list was originally based on the answers to a Quora question I posted years ago: What are the most important research papers which all NLP students should definitely read?

I thank all the people who contributed to the original post.

This list is far from complete or objective, and is evolving, as important papers are being published year after year.

Please let me know via pull requests and issues if anything is missing.

Also, I didn't try to include links to the original papers, since keeping links current (and replacing dead ones) is a lot of work.

I'm sure you can find most (if not all) of the papers listed here via a single Google search by their titles.

A paper doesn't have to be a peer-reviewed conference/journal paper to appear here.

We also include tutorial/survey-style papers and blog posts that are often easier to understand than the original papers.

Machine Learning

  • Avrim Blum and Tom Mitchell: Combining Labeled and Unlabeled Data with Co-Training, 1998.
  • John Lafferty, Andrew McCallum, Fernando C.N. Pereira: Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, ICML 2001.
  • Charles Sutton, Andrew McCallum. An Introduction to Conditional Random Fields for Relational Learning.
  • Kamal Nigam, et al.: Text Classification from Labeled and Unlabeled Documents using EM. Machine Learning, 1999.
  • Kevin Knight: Bayesian Inference with Tears, 2009.
  • Marco Tulio Ribeiro, et al.: "Why Should I Trust You?": Explaining the Predictions of Any Classifier, KDD 2016.

Neural Models

  • Richard Socher, et al.: Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection, NIPS 2011.
  • Ronan Collobert et al.: Natural Language Processing (almost) from Scratch, J. of Machine Learning Research, 2011.
  • Richard Socher, et al.: Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, EMNLP 2013.
  • Xiang Zhang, Junbo Zhao, and Yann LeCun: Character-level Convolutional Networks for Text Classification, NIPS 2015.
  • Yoon Kim: Convolutional Neural Networks for Sentence Classification, 2014.
  • Christopher Olah: Understanding LSTM Networks, 2015.
  • Matthew E. Peters, et al.: Deep contextualized word representations, 2018.
  • Jacob Devlin, et al.: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2018.

Clustering & Word Embeddings

  • Peter F Brown, et al.: Class-Based n-gram Models of Natural Language, 1992.
  • Tomas Mikolov, et al.: Efficient Estimation of Word Representations in Vector Space, 2013.
  • Tomas Mikolov, et al.: Distributed Representations of Words and Phrases and their Compositionality, NIPS 2013.
  • Quoc V. Le and Tomas Mikolov: Distributed Representations of Sentences and Documents, 2014.
  • Jeffrey Pennington, et al.: GloVe: Global Vectors for Word Representation, 2014.
  • Ryan Kiros, et al.: Skip-Thought Vectors, 2015.
  • Piotr Bojanowski, et al.: Enriching Word Vectors with Subword Information, 2017.
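All of the embedding papers above start from the distributional hypothesis: words that occur in similar contexts should get similar vectors. A minimal pure-Python sketch of that idea using raw co-occurrence counts (the toy corpus and window size are illustrative assumptions; the papers themselves learn dense vectors rather than counting):

```python
from collections import defaultdict
from math import sqrt

def cooccurrence_vectors(sentences, window=2):
    """Count how often each word co-occurs with others within a window."""
    vecs = defaultdict(lambda: defaultdict(int))
    for sent in sentences:
        for i, w in enumerate(sent):
            for j in range(max(0, i - window), min(len(sent), i + window + 1)):
                if i != j:
                    vecs[w][sent[j]] += 1
    return vecs

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(c * v.get(k, 0) for k, c in u.items())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
    "the cat chased the dog".split(),
]
vecs = cooccurrence_vectors(corpus)
# "cat" and "dog" share contexts ("the", "sat", "on"), so similarity is high.
print(cosine(vecs["cat"], vecs["dog"]))  # → ~0.98
```

Word2vec, GloVe, and fastText can all be read as smarter, dense-vector refinements of this counting picture.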

Topic Models

  • Thomas Hofmann: Probabilistic Latent Semantic Indexing, SIGIR 1999.
  • David Blei, Andrew Y. Ng, and Michael I. Jordan: Latent Dirichlet Allocation, J. Machine Learning Research, 2003.

Language Modeling

  • Joshua Goodman: A bit of progress in language modeling, MSR Technical Report, 2001.
  • Stanley F. Chen and Joshua Goodman: An Empirical Study of Smoothing Techniques for Language Modeling, ACL 1996.
  • Yee Whye Teh: A Hierarchical Bayesian Language Model based on Pitman-Yor Processes, COLING/ACL 2006.
  • Yee Whye Teh: A Bayesian interpretation of Interpolated Kneser-Ney, 2006.
  • Yoshua Bengio, et al.: A Neural Probabilistic Language Model, J. of Machine Learning Research, 2003.
  • Andrej Karpathy: The Unreasonable Effectiveness of Recurrent Neural Networks, 2015.
  • Yoon Kim, et al.: Character-Aware Neural Language Models, 2015.
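A recurring theme in the language modeling papers above, especially Chen & Goodman, is smoothing: reserving probability mass for n-grams never seen in training. Here is a minimal bigram model with add-one (Laplace) smoothing, the simplest baseline they compare against (the toy corpus is an illustrative assumption):

```python
from collections import Counter

def train_bigram(sentences):
    """Collect unigram (history) and bigram counts, padding with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    vocab = set()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        vocab.update(tokens)
        unigrams.update(tokens[:-1])            # every token serves as a history
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams, vocab

def prob(w, history, unigrams, bigrams, vocab):
    """Add-one smoothed P(w | history): never zero, even for unseen bigrams."""
    return (bigrams[(history, w)] + 1) / (unigrams[history] + len(vocab))

corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
uni, bi, V = train_bigram(corpus)
print(prob("cat", "the", uni, bi, V))  # seen bigram → 0.25
print(prob("sat", "the", uni, bi, V))  # unseen bigram → 0.125, small but nonzero
```

The distribution still sums to one over the vocabulary; better schemes (Kneser-Ney, Pitman-Yor) redistribute the reserved mass far less crudely.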

Segmentation, Tagging, Parsing

  • Donald Hindle and Mats Rooth. Structural Ambiguity and Lexical Relations, Computational Linguistics, 1993.
  • Adwait Ratnaparkhi: A Maximum Entropy Model for Part-Of-Speech Tagging, EMNLP 1996.
  • Eugene Charniak: A Maximum-Entropy-Inspired Parser, NAACL 2000.
  • Michael Collins: Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms, EMNLP 2002.
  • Dan Klein and Christopher Manning: Accurate Unlexicalized Parsing, ACL 2003.
  • Dan Klein and Christopher Manning: Corpus-Based Induction of Syntactic Structure: Models of Dependency and Constituency, ACL 2004.
  • Joakim Nivre and Mario Scholz: Deterministic Dependency Parsing of English Text, COLING 2004.
  • Ryan McDonald et al.: Non-Projective Dependency Parsing using Spanning-Tree Algorithms, EMNLP 2005.
  • Daniel Andor et al.: Globally Normalized Transition-Based Neural Networks, 2016.
  • Oriol Vinyals, et al.: Grammar as a Foreign Language, 2015.

Sequential Labeling & Information Extraction

  • Marti A. Hearst: Automatic Acquisition of Hyponyms from Large Text Corpora, COLING 1992.
  • Michael Collins and Yoram Singer: Unsupervised Models for Named Entity Classification, EMNLP 1999.
  • Patrick Pantel and Dekang Lin, Discovering Word Senses from Text, SIGKDD, 2002.
  • Mike Mintz et al.: Distant supervision for relation extraction without labeled data, ACL 2009.
  • Zhiheng Huang et al.: Bidirectional LSTM-CRF Models for Sequence Tagging, 2015.
  • Xuezhe Ma and Eduard Hovy: End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF, ACL 2016.
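Whether the scores come from a CRF or a BiLSTM, the sequence labeling models above decode the best tag sequence with the Viterbi algorithm. A minimal sketch with hand-set additive scores (the tagset, words, and score values are illustrative assumptions, not from any of the papers):

```python
def viterbi(obs, states, emit, trans):
    """Return the highest-scoring state sequence for `obs`.
    Scores are additive (log-space style); missing emission/transition
    entries default to a large penalty."""
    PENALTY = -100.0
    # best[i][s] = (score of best path ending in state s at position i, backpointer)
    best = [{s: (emit.get((s, obs[0]), PENALTY), None) for s in states}]
    for i in range(1, len(obs)):
        row = {}
        for s in states:
            score, prev = max(
                (best[i - 1][p][0]
                 + trans.get((p, s), PENALTY)
                 + emit.get((s, obs[i]), PENALTY), p)
                for p in states
            )
            row[s] = (score, prev)
        best.append(row)
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for i in range(len(obs) - 1, 0, -1):
        path.append(best[i][path[-1]][1])
    return list(reversed(path))

states = ["DET", "NOUN", "VERB"]
emit = {("DET", "the"): 0.0, ("NOUN", "dog"): 0.0, ("VERB", "barks"): 0.0,
        ("NOUN", "barks"): -1.0}
trans = {("DET", "NOUN"): 0.0, ("NOUN", "VERB"): 0.0, ("NOUN", "NOUN"): -2.0}
print(viterbi(["the", "dog", "barks"], states, emit, trans))
# → ['DET', 'NOUN', 'VERB']
```

In a BiLSTM-CRF, the emission scores come from the network and the transition scores are learned parameters; the decoding step is exactly this dynamic program.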

Machine Translation & Transliteration, Sequence-to-Sequence Models

  • Peter F. Brown et al.: A Statistical Approach to Machine Translation, Computational Linguistics, 1990.
  • Kevin Knight and Jonathan Graehl: Machine Transliteration, Computational Linguistics, 1998.
  • Dekai Wu: Inversion Transduction Grammars and the Bilingual Parsing of Parallel Corpora, Computational Linguistics, 1997.
  • Kevin Knight: A Statistical MT Tutorial Workbook, 1999.
  • Kishore Papineni, et al.: BLEU: a Method for Automatic Evaluation of Machine Translation, ACL 2002.
  • Philipp Koehn, Franz J Och, and Daniel Marcu: Statistical Phrase-Based Translation, NAACL 2003.
  • Philip Resnik and Noah A. Smith: The Web as a Parallel Corpus, Computational Linguistics, 2003.
  • Franz J Och and Hermann Ney: The Alignment-Template Approach to Statistical Machine Translation, Computational Linguistics, 2004.
  • David Chiang. A Hierarchical Phrase-Based Model for Statistical Machine Translation, ACL 2005.
  • Ilya Sutskever, Oriol Vinyals, and Quoc V. Le: Sequence to Sequence Learning with Neural Networks, NIPS 2014.
  • Oriol Vinyals, Quoc Le: A Neural Conversation Model, 2015.
  • Dzmitry Bahdanau, et al.: Neural Machine Translation by Jointly Learning to Align and Translate, 2014.
  • Minh-Thang Luong, et al.: Effective Approaches to Attention-based Neural Machine Translation, 2015.
  • Rico Sennrich et al.: Neural Machine Translation of Rare Words with Subword Units. ACL 2016.
  • Yonghui Wu, et al.: Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, 2016.
  • Jonas Gehring, et al.: Convolutional Sequence to Sequence Learning, 2017.
  • Ashish Vaswani, et al.: Attention Is All You Need, 2017.
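The BLEU metric (Papineni et al., above) scores a candidate translation by modified n-gram precision combined with a brevity penalty. A minimal single-sentence, single-reference sketch (real BLEU is computed at corpus level and usually smoothed; the example sentences are illustrative):

```python
from collections import Counter
from math import exp, log

def bleu(candidate, reference, max_n=4):
    """Single-sentence BLEU: geometric mean of modified n-gram precisions,
    times a brevity penalty. Returns 0 if any precision is zero
    (real implementations smooth instead)."""
    precisions = []
    for n in range(1, max_n + 1):
        cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
        ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
        # "Modified" precision: each n-gram is clipped by its reference count.
        overlap = sum(min(c, ref[g]) for g, c in cand.items())
        total = sum(cand.values())
        if total == 0 or overlap == 0:
            return 0.0
        precisions.append(overlap / total)
    # Brevity penalty punishes candidates shorter than the reference.
    bp = 1.0 if len(candidate) > len(reference) else exp(1 - len(reference) / len(candidate))
    return bp * exp(sum(log(p) for p in precisions) / max_n)

ref = "the cat is on the mat".split()
print(bleu("the cat is on the mat".split(), ref))  # perfect match → 1.0
print(bleu("the cat is on a mat".split(), ref))    # partial match → between 0 and 1
```

The clipping step is what stops a degenerate candidate like "the the the the" from getting full unigram credit.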

Coreference Resolution

  • Vincent Ng: Supervised Noun Phrase Coreference Research: The First Fifteen Years, ACL 2010.
  • Kenton Lee et al.: End-to-end Neural Coreference Resolution, EMNLP 2017.

Automatic Text Summarization

  • Kevin Knight and Daniel Marcu: Summarization beyond sentence extraction. Artificial Intelligence 139, 2002.
  • James Clarke and Mirella Lapata: Modeling Compression with Discourse Constraints. EMNLP-CONLL 2007.
  • Ryan McDonald: A Study of Global Inference Algorithms in Multi-Document Summarization, ECIR 2007.
  • Wen-tau Yih et al.: Multi-Document Summarization by Maximizing Informative Content-Words. IJCAI 2007.
  • Alexander M Rush, et al.: A Neural Attention Model for Sentence Summarization. EMNLP 2015.

Question Answering and Machine Comprehension

  • Pranav Rajpurkar et al.: SQuAD: 100,000+ Questions for Machine Comprehension of Text. EMNLP 2016.
  • Minjoon Seo et al.: Bi-Directional Attention Flow for Machine Comprehension. ICLR 2017.

Generation, Reinforcement Learning

  • Jiwei Li, et al.: Deep Reinforcement Learning for Dialogue Generation, EMNLP 2016.
  • Marc’Aurelio Ranzato et al.: Sequence Level Training with Recurrent Neural Networks. ICLR 2016.
  • Lantao Yu, et al.: SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient, AAAI 2017.
