Deep Learning Reading List
Reading List
List of reading lists and survey papers:
Books
- Deep Learning, Yoshua Bengio, Ian Goodfellow, Aaron Courville, MIT Press, In preparation.
Review Papers
- Representation Learning: A Review and New Perspectives, Yoshua Bengio, Aaron Courville, Pascal Vincent, Arxiv, 2012.
- The monograph or review paper Learning Deep Architectures for AI (Foundations & Trends in Machine Learning, 2009).
- Deep Machine Learning – A New Frontier in Artificial Intelligence Research – a survey paper by Itamar Arel, Derek C. Rose, and Thomas P. Karnowski.
- Graves, A. (2012). Supervised sequence labelling with recurrent neural networks (Vol. 385). Springer.
- Schmidhuber, J. (2014). Deep Learning in Neural Networks: An Overview. 75 pages, 850+ references, http://arxiv.org/abs/1404.7828, PDF & LATEX source & complete public BIBTEX file under http://www.idsia.ch/~juergen/deep-learning-overview.html.
- LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. “Deep learning.” Nature 521, no. 7553 (2015): 436-444.
Reinforcement Learning
- Mnih, Volodymyr, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. “Playing Atari with deep reinforcement learning.” arXiv preprint arXiv:1312.5602 (2013).
- Volodymyr Mnih, Nicolas Heess, Alex Graves, Koray Kavukcuoglu. “Recurrent Models of Visual Attention” ArXiv e-print, 2014.
Computer Vision
- ImageNet Classification with Deep Convolutional Neural Networks, Alex Krizhevsky, Ilya Sutskever, Geoffrey E Hinton, NIPS 2012.
- Going Deeper with Convolutions, Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich, 19-Sept-2014.
- Learning Hierarchical Features for Scene Labeling, Clement Farabet, Camille Couprie, Laurent Najman and Yann LeCun, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013.
- Learning Convolutional Feature Hierarchies for Visual Recognition, Koray Kavukcuoglu, Pierre Sermanet, Y-Lan Boureau, Karol Gregor, Michaël Mathieu and Yann LeCun, Advances in Neural Information Processing Systems (NIPS 2010), 23, 2010.
- Graves, Alex, et al. “A novel connectionist system for unconstrained handwriting recognition.” Pattern Analysis and Machine Intelligence, IEEE Transactions on 31.5 (2009): 855-868.
- Cireşan, D. C., Meier, U., Gambardella, L. M., & Schmidhuber, J. (2010). Deep, big, simple neural nets for handwritten digit recognition. Neural computation, 22(12), 3207-3220.
- Ciresan, Dan, Ueli Meier, and Jürgen Schmidhuber. “Multi-column deep neural networks for image classification.” Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012.
- Ciresan, D., Meier, U., Masci, J., & Schmidhuber, J. (2011, July). A committee of neural networks for traffic sign classification. In Neural Networks (IJCNN), The 2011 International Joint Conference on (pp. 1918-1921). IEEE.
NLP and Speech
- Joint Learning of Words and Meaning Representations for Open-Text Semantic Parsing, Antoine Bordes, Xavier Glorot, Jason Weston and Yoshua Bengio (2012), in: Proceedings of the 15th International Conference on Artificial Intelligence and Statistics (AISTATS)
- Dynamic pooling and unfolding recursive autoencoders for paraphrase detection. Socher, R., Huang, E. H., Pennington, J., Ng, A. Y., and Manning, C. D. (2011a). In NIPS’2011.
- Semi-supervised recursive autoencoders for predicting sentiment distributions. Socher, R., Pennington, J., Huang, E. H., Ng, A. Y., and Manning, C. D. (2011b). In EMNLP’2011.
- Mikolov Tomáš: Statistical Language Models based on Neural Networks. PhD thesis, Brno University of Technology, 2012.
- Graves, Alex, and Jürgen Schmidhuber. “Framewise phoneme classification with bidirectional LSTM and other neural network architectures.” Neural Networks 18.5 (2005): 602-610.
- Mikolov, Tomas, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. “Distributed representations of words and phrases and their compositionality.” In Advances in Neural Information Processing Systems, pp. 3111-3119. 2013.
- K. Cho, B. van Merrienboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, Y. Bengio. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. EMNLP 2014.
- Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. “Sequence to sequence learning with neural networks.” Advances in Neural Information Processing Systems. 2014.
Disentangling Factors and Variations with Depth
- Goodfellow, Ian, et al. “Measuring invariances in deep networks.” Advances in neural information processing systems 22 (2009): 646-654.
- Bengio, Yoshua, et al. “Better Mixing via Deep Representations.” arXiv preprint arXiv:1207.4404 (2012).
- Xavier Glorot, Antoine Bordes and Yoshua Bengio, Domain Adaptation for Large-Scale Sentiment Classification: A Deep Learning Approach, in: Proceedings of the Twenty-eighth International Conference on Machine Learning (ICML’11), pages 97-110, 2011.
Transfer Learning and Domain Adaptation
- Raina, Rajat, et al. “Self-taught learning: transfer learning from unlabeled data.” Proceedings of the 24th international conference on Machine learning. ACM, 2007.
- Xavier Glorot, Antoine Bordes and Yoshua Bengio, Domain Adaptation for Large-Scale Sentiment Classification: A Deep Learning Approach, in: Proceedings of the Twenty-eighth International Conference on Machine Learning (ICML’11), pages 97-110, 2011.
- R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu and P. Kuksa. Natural Language Processing (Almost) from Scratch. Journal of Machine Learning Research, 12:2493-2537, 2011.
- Mesnil, Grégoire, et al. “Unsupervised and transfer learning challenge: a deep learning approach.” Unsupervised and Transfer Learning Workshop, in conjunction with ICML. 2011.
- Ciresan, D. C., Meier, U., & Schmidhuber, J. (2012, June). Transfer learning for Latin and Chinese characters with deep neural networks. In Neural Networks (IJCNN), The 2012 International Joint Conference on (pp. 1-6). IEEE.
- Goodfellow, Ian, Aaron Courville, and Yoshua Bengio. “Large-Scale Feature Learning With Spike-and-Slab Sparse Coding.” ICML 2012.
Practical Tricks and Guides
- “Improving neural networks by preventing co-adaptation of feature detectors.” Hinton, Geoffrey E., et al. arXiv preprint arXiv:1207.0580 (2012).
- Practical recommendations for gradient-based training of deep architectures, Yoshua Bengio, U. Montreal, arXiv report:1206.5533, Lecture Notes in Computer Science Volume 7700, Neural Networks: Tricks of the Trade Second Edition, Editors: Grégoire Montavon, Geneviève B. Orr, Klaus-Robert Müller, 2012.
- A practical guide to training Restricted Boltzmann Machines, by Geoffrey Hinton.
Sparse Coding
- Emergence of simple-cell receptive field properties by learning a sparse code for natural images, Bruno A. Olshausen and David J. Field, Nature 1996.
- Kavukcuoglu, Koray, Marc’Aurelio Ranzato, and Yann LeCun. “Fast inference in sparse coding algorithms with applications to object recognition.” arXiv preprint arXiv:1010.3467 (2010).
- Goodfellow, Ian, Aaron Courville, and Yoshua Bengio. “Large-Scale Feature Learning With Spike-and-Slab Sparse Coding.” ICML 2012.
- Efficient sparse coding algorithms. Honglak Lee, Alexis Battle, Rajat Raina and Andrew Y. Ng. In NIPS 19, 2007.
- “Sparse coding with an overcomplete basis set: A strategy employed by V1?” Olshausen, Bruno A., and David J. Field. Vision research 37.23 (1997): 3311-3326.
Foundation Theory and Motivation
- Hinton, Geoffrey E. “Deterministic Boltzmann learning performs steepest descent in weight-space.” Neural computation 1.1 (1989): 143-150.
- Bengio, Yoshua, and Samy Bengio. “Modeling high-dimensional discrete data with multi-layer neural networks.” Advances in Neural Information Processing Systems 12 (2000): 400-406.
- Bengio, Yoshua, et al. “Greedy layer-wise training of deep networks.” Advances in neural information processing systems 19 (2007): 153.
- Bengio, Yoshua, Martin Monperrus, and Hugo Larochelle. “Nonlocal estimation of manifold structure.” Neural Computation 18.10 (2006): 2509-2528.
- Hinton, Geoffrey E., and Ruslan R. Salakhutdinov. “Reducing the dimensionality of data with neural networks.” Science 313.5786 (2006): 504-507.
- Ranzato, Marc’Aurelio, Y-Lan Boureau, and Yann LeCun. “Sparse feature learning for deep belief networks.” Advances in neural information processing systems 20 (2007): 1185-1192.
- Bengio, Yoshua, and Yann LeCun. “Scaling learning algorithms towards AI.” Large-Scale Kernel Machines 34 (2007).
- Le Roux, Nicolas, and Yoshua Bengio. “Representational power of restricted boltzmann machines and deep belief networks.” Neural Computation 20.6 (2008): 1631-1649.
- Sutskever, Ilya, and Geoffrey Hinton. “Temporal-Kernel Recurrent Neural Networks.” Neural Networks 23.2 (2010): 239-243.
- Le Roux, Nicolas, and Yoshua Bengio. “Deep belief networks are compact universal approximators.” Neural computation 22.8 (2010): 2192-2207.
- Bengio, Yoshua, and Olivier Delalleau. “On the expressive power of deep architectures.” Algorithmic Learning Theory. Springer Berlin/Heidelberg, 2011.
- Montufar, Guido F., and Jason Morton. “When Does a Mixture of Products Contain a Product of Mixtures?.” arXiv preprint arXiv:1206.0387 (2012).
- Montúfar, Guido, Razvan Pascanu, Kyunghyun Cho, and Yoshua Bengio. “On the Number of Linear Regions of Deep Neural Networks.” arXiv preprint arXiv:1402.1869 (2014).
Supervised Feedforward Neural Networks
- The Manifold Tangent Classifier, Salah Rifai, Yann Dauphin, Pascal Vincent, Yoshua Bengio and Xavier Muller, in: NIPS’2011.
- “Discriminative Learning of Sum-Product Networks.” Gens, Robert, and Pedro Domingos, NIPS 2012 Best Student Paper.
- Goodfellow, I., Warde-Farley, D., Mirza, M., Courville, A., and Bengio, Y. (2013). Maxout networks. Technical Report, Universite de Montreal.
- Hinton, Geoffrey E., et al. “Improving neural networks by preventing co-adaptation of feature detectors.” arXiv preprint arXiv:1207.0580 (2012).
- Wang, Sida, and Christopher Manning. “Fast dropout training.” In Proceedings of the 30th International Conference on Machine Learning (ICML-13), pp. 118-126. 2013.
- Glorot, Xavier, Antoine Bordes, and Yoshua Bengio. “Deep sparse rectifier networks.” In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics. JMLR W&CP Volume, vol. 15, pp. 315-323. 2011.
- ImageNet Classification with Deep Convolutional Neural Networks, Alex Krizhevsky, Ilya Sutskever, Geoffrey E Hinton, NIPS 2012.
Large Scale Deep Learning
- Building High-level Features Using Large Scale Unsupervised Learning Quoc V. Le, Marc’Aurelio Ranzato, Rajat Monga, Matthieu Devin, Kai Chen, Greg S. Corrado, Jeffrey Dean, and Andrew Y. Ng, ICML 2012.
- Bengio, Yoshua, et al. “Neural probabilistic language models.” Innovations in Machine Learning (2006): 137-186. Section 3 of this paper specifically discusses asynchronous SGD.
- Dean, Jeffrey, et al. “Large scale distributed deep networks.” Advances in Neural Information Processing Systems. 2012.
Recurrent Networks
- Training Recurrent Neural Networks, Ilya Sutskever, PhD Thesis, 2012.
- Bengio, Yoshua, Patrice Simard, and Paolo Frasconi. “Learning long-term dependencies with gradient descent is difficult.” Neural Networks, IEEE Transactions on 5.2 (1994): 157-166.
- Mikolov Tomáš: Statistical Language Models based on Neural Networks. PhD thesis, Brno University of Technology, 2012.
- Hochreiter, Sepp, and Jürgen Schmidhuber. “Long short-term memory.” Neural computation 9.8 (1997): 1735-1780.
- Hochreiter, S., Bengio, Y., Frasconi, P., & Schmidhuber, J. (2001). Gradient flow in recurrent nets: the difficulty of learning long-term dependencies.
- Schmidhuber, J. (1992). Learning complex, extended sequences using the principle of history compression. Neural Computation, 4(2), 234-242.
- Graves, A., Fernández, S., Gomez, F., & Schmidhuber, J. (2006, June). Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks. In Proceedings of the 23rd international conference on Machine learning (pp. 369-376). ACM.
Hyper Parameters
- “Practical Bayesian Optimization of Machine Learning Algorithms”, Jasper Snoek, Hugo Larochelle, Ryan Adams, NIPS 2012.
- Random Search for Hyper-Parameter Optimization, James Bergstra and Yoshua Bengio (2012), in: Journal of Machine Learning Research, 13(281–305).
- Algorithms for Hyper-Parameter Optimization, James Bergstra, Rémy Bardenet, Yoshua Bengio and Balázs Kégl, in: NIPS’2011, 2011.
Optimization
- Training Deep and Recurrent Neural Networks with Hessian-Free Optimization, James Martens and Ilya Sutskever, Neural Networks: Tricks of the Trade, 2012.
- Schaul, Tom, Sixin Zhang, and Yann LeCun. “No More Pesky Learning Rates.” arXiv preprint arXiv:1206.1106 (2012).
- Le Roux, Nicolas, Pierre-Antoine Manzagol, and Yoshua Bengio. “Topmoumoute online natural gradient algorithm.” Neural Information Processing Systems (NIPS). 2007.
- Bordes, Antoine, Léon Bottou, and Patrick Gallinari. “SGD-QN: Careful quasi-Newton stochastic gradient descent.” The Journal of Machine Learning Research 10 (2009): 1737-1754.
- Glorot, Xavier, and Yoshua Bengio. “Understanding the difficulty of training deep feedforward neural networks.” Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS’10). Society for Artificial Intelligence and Statistics. 2010.
- Glorot, Xavier, Antoine Bordes, and Yoshua Bengio. “Deep Sparse Rectifier Networks.” Proceedings of the 14th International Conference on Artificial Intelligence and Statistics. JMLR W&CP Volume. Vol. 15. 2011.
- “Deep learning via Hessian-free optimization.” Martens, James. Proceedings of the 27th International Conference on Machine Learning (ICML). Vol. 951. 2010.
- Hochreiter, Sepp, and Jürgen Schmidhuber. “Flat minima.” Neural Computation, 9.1 (1997): 1-42.
- Pascanu, Razvan, and Yoshua Bengio. “Revisiting natural gradient for deep networks.” arXiv preprint arXiv:1301.3584 (2013).
- Dauphin, Yann N., Razvan Pascanu, Caglar Gulcehre, Kyunghyun Cho, Surya Ganguli, and Yoshua Bengio. “Identifying and attacking the saddle point problem in high-dimensional non-convex optimization.” In Advances in Neural Information Processing Systems, pp. 2933-2941. 2014.
Unsupervised Feature Learning
- Salakhutdinov, Ruslan, and Geoffrey E. Hinton. “Deep boltzmann machines.” Proceedings of the international conference on artificial intelligence and statistics. Vol. 5. No. 2. Cambridge, MA: MIT Press, 2009.
- Scholarpedia page on Deep Belief Networks.
Deep Boltzmann Machines
- An Efficient Learning Procedure for Deep Boltzmann Machines, Ruslan Salakhutdinov and Geoffrey Hinton, Neural Computation August 2012, Vol. 24, No. 8: 1967-2006.
- Montavon, Grégoire, and Klaus-Robert Müller. “Deep Boltzmann Machines and the Centering Trick.” Neural Networks: Tricks of the Trade (2012): 621-637.
- Salakhutdinov, Ruslan, and Hugo Larochelle. “Efficient learning of deep boltzmann machines.” International Conference on Artificial Intelligence and Statistics. 2010.
- Salakhutdinov, Ruslan. Learning deep generative models. Diss. University of Toronto, 2009.
- Goodfellow, Ian, et al. “Multi-prediction deep Boltzmann machines.” Advances in Neural Information Processing Systems. 2013.
RBMs
- Unsupervised Models of Images by Spike-and-Slab RBMs, Aaron Courville, James Bergstra and Yoshua Bengio, in: ICML’2011
- Hinton, Geoffrey. “A practical guide to training restricted Boltzmann machines.” Momentum 9.1 (2010): 926.
Autoencoders
- Regularized Auto-Encoders Estimate Local Statistics, Guillaume Alain, Yoshua Bengio and Salah Rifai, Université de Montréal, arXiv report 1211.4246, 2012
- A Generative Process for Sampling Contractive Auto-Encoders, Salah Rifai, Yoshua Bengio, Yann Dauphin and Pascal Vincent, in: ICML’2012, Edinburgh, Scotland, U.K., 2012
- Contractive Auto-Encoders: Explicit invariance during feature extraction, Salah Rifai, Pascal Vincent, Xavier Muller, Xavier Glorot and Yoshua Bengio, in: ICML’2011
- Disentangling factors of variation for facial expression recognition, Salah Rifai, Yoshua Bengio, Aaron Courville, Pascal Vincent and Mehdi Mirza, in: ECCV’2012.
- Vincent, Pascal, et al. “Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion.” The Journal of Machine Learning Research 11 (2010): 3371-3408.
- Vincent, Pascal. “A connection between score matching and denoising autoencoders.” Neural computation 23.7 (2011): 1661-1674.
- Chen, Minmin, et al. “Marginalized denoising autoencoders for domain adaptation.” arXiv preprint arXiv:1206.4683 (2012).
Miscellaneous
- The ICML 2009 Workshop on Learning Feature Hierarchies webpage has a reading list.
- Stanford’s UFLDL Recommended Readings.
- The LISApublic wiki has a reading list and a bibliography.
- Geoff Hinton has readings for his NIPS 2007 tutorial.
- The LISA publications database contains a deep architectures category.
- A very brief introduction to AI, Machine Learning, and Deep Learning in Yoshua Bengio’s IFT6266 graduate class.
- Memkite’s deep learning reading list, http://memkite.com/deep-learning-bibliography/.
- Deep learning resources page, http://www.jeremydjacksonphd.com/?cat=7
from: http://deeplearning.net/reading-list/