Deep Reinforcement Learning Papers

A list of recent papers on deep reinforcement learning.
The papers are organized under manually defined bookmarks.
Within each bookmark, they are sorted by date, with the most recent papers first.
Suggestions and pull requests are welcome.

Bookmarks

All Papers

Model-Free Episodic Control, C. Blundell et al., arXiv, 2016.
Safe and Efficient Off-Policy Reinforcement Learning, R. Munos et al., arXiv, 2016.
Deep Successor Reinforcement Learning, T. D. Kulkarni et al., arXiv, 2016.
Unifying Count-Based Exploration and Intrinsic Motivation, M. G. Bellemare et al., arXiv, 2016.
Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks, R. Houthooft et al., arXiv, 2016.
Control of Memory, Active Perception, and Action in Minecraft, J. Oh et al., ICML, 2016.
Dynamic Frame skip Deep Q Network, A. S. Lakshminarayanan et al., IJCAI Deep RL Workshop, 2016.
Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv, 2016.
Benchmarking Deep Reinforcement Learning for Continuous Control, Y. Duan et al., ICML, 2016.
Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv, 2016.
Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, S. Levine et al., arXiv, 2016.
Continuous Deep Q-Learning with Model-based Acceleration, S. Gu et al., ICML, 2016.
Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, C. Finn et al., arXiv, 2016.
Deep Exploration via Bootstrapped DQN, I. Osband et al., arXiv, 2016.
Value Iteration Networks, A. Tamar et al., arXiv, 2016.
Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks, J. N. Foerster et al., arXiv, 2016.
Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv, 2016.
Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature, 2016.
Increasing the Action Gap: New Operators for Reinforcement Learning, M. G. Bellemare et al., AAAI, 2016.
Memory-based control with recurrent neural networks, N. Heess et al., NIPS Workshop, 2015.
How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies, V. François-Lavet et al., NIPS Workshop, 2015.
Multiagent Cooperation and Competition with Deep Reinforcement Learning, A. Tampuu et al., arXiv, 2015.
Strategic Dialogue Management via Deep Reinforcement Learning, H. Cuayáhuitl et al., NIPS Workshop, 2015.
MazeBase: A Sandbox for Learning from Games, S. Sukhbaatar et al., arXiv, 2016.
Learning Simple Algorithms from Examples, W. Zaremba et al., arXiv, 2015.
Dueling Network Architectures for Deep Reinforcement Learning, Z. Wang et al., arXiv, 2015.
Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, E. Parisotto et al., ICLR, 2016.
Better Computer Go Player with Neural Network and Long-term Prediction, Y. Tian et al., ICLR, 2016.
Policy Distillation, A. A. Rusu et al., ICLR, 2016.
Prioritized Experience Replay, T. Schaul et al., ICLR, 2016.
Deep Reinforcement Learning with an Action Space Defined by Natural Language, J. He et al., arXiv, 2015.
Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., ICLR, 2016.
Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control, F. Zhang et al., arXiv, 2015.
Generating Text with Deep Reinforcement Learning, H. Guo, arXiv, 2015.
ADAAPT: A Deep Architecture for Adaptive Policy Transfer from Multiple Sources, J. Rajendran et al., arXiv, 2015.
Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, S. Mohamed and D. J. Rezende, arXiv, 2015.
Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., arXiv, 2015.
Recurrent Reinforcement Learning: A Hybrid Approach, X. Li et al., arXiv, 2015.
Continuous control with deep reinforcement learning, T. P. Lillicrap et al., ICLR, 2016.
Language Understanding for Text-based Games Using Deep Reinforcement Learning, K. Narasimhan et al., EMNLP, 2015.
Giraffe: Using Deep Reinforcement Learning to Play Chess, M. Lai, arXiv, 2015.
Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., NIPS, 2015.
Learning Continuous Control Policies by Stochastic Value Gradients, N. Heess et al., NIPS, 2015.
Learning Deep Neural Network Policies with Continuous Memory States, M. Zhang et al., arXiv, 2015.
Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, arXiv, 2015.
Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences, H. Mei et al., arXiv, 2015.
Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., arXiv, 2015.
Maximum Entropy Deep Inverse Reinforcement Learning, M. Wulfmeier et al., arXiv, 2015.
High-Dimensional Continuous Control Using Generalized Advantage Estimation, J. Schulman et al., ICLR, 2016.
End-to-End Training of Deep Visuomotor Policies, S. Levine et al., arXiv, 2015.
DeepMPC: Learning Deep Latent Features for Model Predictive Control, I. Lenz et al., RSS, 2015.
Universal Value Function Approximators, T. Schaul et al., ICML, 2015.
Deterministic Policy Gradient Algorithms, D. Silver et al., ICML, 2014.
Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., ICML Workshop, 2015.
Trust Region Policy Optimization, J. Schulman et al., ICML, 2015.
Human-level control through deep reinforcement learning, V. Mnih et al., Nature, 2015.
Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, X. Guo et al., NIPS, 2014.
Playing Atari with Deep Reinforcement Learning, V. Mnih et al., NIPS Workshop, 2013.

Value

Model-Free Episodic Control, C. Blundell et al., arXiv, 2016.
Safe and Efficient Off-Policy Reinforcement Learning, R. Munos et al., arXiv, 2016.
Deep Successor Reinforcement Learning, T. D. Kulkarni et al., arXiv, 2016.
Unifying Count-Based Exploration and Intrinsic Motivation, M. G. Bellemare et al., arXiv, 2016.
Control of Memory, Active Perception, and Action in Minecraft, J. Oh et al., ICML, 2016.
Dynamic Frame skip Deep Q Network, A. S. Lakshminarayanan et al., IJCAI Deep RL Workshop, 2016.
Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv, 2016.
Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv, 2016.
Continuous Deep Q-Learning with Model-based Acceleration, S. Gu et al., ICML, 2016.
Deep Exploration via Bootstrapped DQN, I. Osband et al., arXiv, 2016.
Value Iteration Networks, A. Tamar et al., arXiv, 2016.
Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks, J. N. Foerster et al., arXiv, 2016.
Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv, 2016.
Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature, 2016.
Increasing the Action Gap: New Operators for Reinforcement Learning, M. G. Bellemare et al., AAAI, 2016.
How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies, V. François-Lavet et al., NIPS Workshop, 2015.
Multiagent Cooperation and Competition with Deep Reinforcement Learning, A. Tampuu et al., arXiv, 2015.
Strategic Dialogue Management via Deep Reinforcement Learning, H. Cuayáhuitl et al., NIPS Workshop, 2015.
Learning Simple Algorithms from Examples, W. Zaremba et al., arXiv, 2015.
Dueling Network Architectures for Deep Reinforcement Learning, Z. Wang et al., arXiv, 2015.
Prioritized Experience Replay, T. Schaul et al., ICLR, 2016.
Deep Reinforcement Learning with an Action Space Defined by Natural Language, J. He et al., arXiv, 2015.
Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., ICLR, 2016.
Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control, F. Zhang et al., arXiv, 2015.
Generating Text with Deep Reinforcement Learning, H. Guo, arXiv, 2015.
Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., arXiv, 2015.
Recurrent Reinforcement Learning: A Hybrid Approach, X. Li et al., arXiv, 2015.
Continuous control with deep reinforcement learning, T. P. Lillicrap et al., ICLR, 2016.
Language Understanding for Text-based Games Using Deep Reinforcement Learning, K. Narasimhan et al., EMNLP, 2015.
Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., NIPS, 2015.
Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, arXiv, 2015.
Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., arXiv, 2015.
Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., ICML Workshop, 2015.
Human-level control through deep reinforcement learning, V. Mnih et al., Nature, 2015.
Playing Atari with Deep Reinforcement Learning, V. Mnih et al., NIPS Workshop, 2013.

Policy

Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks, R. Houthooft et al., arXiv, 2016.
Benchmarking Deep Reinforcement Learning for Continuous Control, Y. Duan et al., ICML, 2016.
Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, S. Levine et al., arXiv, 2016.
Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, C. Finn et al., arXiv, 2016.
Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv, 2016.
Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature, 2016.
Memory-based control with recurrent neural networks, N. Heess et al., NIPS Workshop, 2015.
MazeBase: A Sandbox for Learning from Games, S. Sukhbaatar et al., arXiv, 2016.
ADAAPT: A Deep Architecture for Adaptive Policy Transfer from Multiple Sources, J. Rajendran et al., arXiv, 2015.
Continuous control with deep reinforcement learning, T. P. Lillicrap et al., ICLR, 2016.
Learning Continuous Control Policies by Stochastic Value Gradients, N. Heess et al., NIPS, 2015.
High-Dimensional Continuous Control Using Generalized Advantage Estimation, J. Schulman et al., ICLR, 2016.
End-to-End Training of Deep Visuomotor Policies, S. Levine et al., arXiv, 2015.
Deterministic Policy Gradient Algorithms, D. Silver et al., ICML, 2014.
Trust Region Policy Optimization, J. Schulman et al., ICML, 2015.

Discrete Control

Model-Free Episodic Control, C. Blundell et al., arXiv, 2016.
Safe and Efficient Off-Policy Reinforcement Learning, R. Munos et al., arXiv, 2016.
Deep Successor Reinforcement Learning, T. D. Kulkarni et al., arXiv, 2016.
Unifying Count-Based Exploration and Intrinsic Motivation, M. G. Bellemare et al., arXiv, 2016.
Control of Memory, Active Perception, and Action in Minecraft, J. Oh et al., ICML, 2016.
Dynamic Frame skip Deep Q Network, A. S. Lakshminarayanan et al., IJCAI Deep RL Workshop, 2016.
Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv, 2016.
Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv, 2016.
Deep Exploration via Bootstrapped DQN, I. Osband et al., arXiv, 2016.
Value Iteration Networks, A. Tamar et al., arXiv, 2016.
Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks, J. N. Foerster et al., arXiv, 2016.
Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv, 2016.
Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature, 2016.
Increasing the Action Gap: New Operators for Reinforcement Learning, M. G. Bellemare et al., AAAI, 2016.
How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies, V. François-Lavet et al., NIPS Workshop, 2015.
Multiagent Cooperation and Competition with Deep Reinforcement Learning, A. Tampuu et al., arXiv, 2015.
Strategic Dialogue Management via Deep Reinforcement Learning, H. Cuayáhuitl et al., NIPS Workshop, 2015.
Learning Simple Algorithms from Examples, W. Zaremba et al., arXiv, 2015.
Dueling Network Architectures for Deep Reinforcement Learning, Z. Wang et al., arXiv, 2015.
Better Computer Go Player with Neural Network and Long-term Prediction, Y. Tian et al., ICLR, 2016.
Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, E. Parisotto et al., ICLR, 2016.
Policy Distillation, A. A. Rusu et al., ICLR, 2016.
Prioritized Experience Replay, T. Schaul et al., ICLR, 2016.
Deep Reinforcement Learning with an Action Space Defined by Natural Language, J. He et al., arXiv, 2015.
Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., ICLR, 2016.
Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control, F. Zhang et al., arXiv, 2015.
Generating Text with Deep Reinforcement Learning, H. Guo, arXiv, 2015.
ADAAPT: A Deep Architecture for Adaptive Policy Transfer from Multiple Sources, J. Rajendran et al., arXiv, 2015.
Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, S. Mohamed and D. J. Rezende, arXiv, 2015.
Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., arXiv, 2015.
Recurrent Reinforcement Learning: A Hybrid Approach, X. Li et al., arXiv, 2015.
Language Understanding for Text-based Games Using Deep Reinforcement Learning, K. Narasimhan et al., EMNLP, 2015.
Giraffe: Using Deep Reinforcement Learning to Play Chess, M. Lai, arXiv, 2015.
Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., NIPS, 2015.
Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, arXiv, 2015.
Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences, H. Mei et al., arXiv, 2015.
Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., arXiv, 2015.
Universal Value Function Approximators, T. Schaul et al., ICML, 2015.
Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., ICML Workshop, 2015.
Human-level control through deep reinforcement learning, V. Mnih et al., Nature, 2015.
Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, X. Guo et al., NIPS, 2014.
Playing Atari with Deep Reinforcement Learning, V. Mnih et al., NIPS Workshop, 2013.

Continuous Control

Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks, R. Houthooft et al., arXiv, 2016.
Benchmarking Deep Reinforcement Learning for Continuous Control, Y. Duan et al., ICML, 2016.
Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, S. Levine et al., arXiv, 2016.
Continuous Deep Q-Learning with Model-based Acceleration, S. Gu et al., ICML, 2016.
Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, C. Finn et al., arXiv, 2016.
Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv, 2016.
Memory-based control with recurrent neural networks, N. Heess et al., NIPS Workshop, 2015.
Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, S. Mohamed and D. J. Rezende, arXiv, 2015.
Continuous control with deep reinforcement learning, T. P. Lillicrap et al., ICLR, 2016.
Learning Continuous Control Policies by Stochastic Value Gradients, N. Heess et al., NIPS, 2015.
Learning Deep Neural Network Policies with Continuous Memory States, M. Zhang et al., arXiv, 2015.
High-Dimensional Continuous Control Using Generalized Advantage Estimation, J. Schulman et al., ICLR, 2016.
End-to-End Training of Deep Visuomotor Policies, S. Levine et al., arXiv, 2015.
DeepMPC: Learning Deep Latent Features for Model Predictive Control, I. Lenz et al., RSS, 2015.
Deterministic Policy Gradient Algorithms, D. Silver et al., ICML, 2014.
Trust Region Policy Optimization, J. Schulman et al., ICML, 2015.

Text Domain

Strategic Dialogue Management via Deep Reinforcement Learning, H. Cuayáhuitl et al., NIPS Workshop, 2015.
MazeBase: A Sandbox for Learning from Games, S. Sukhbaatar et al., arXiv, 2016.
Deep Reinforcement Learning with an Action Space Defined by Natural Language, J. He et al., arXiv, 2015.
Generating Text with Deep Reinforcement Learning, H. Guo, arXiv, 2015.
Language Understanding for Text-based Games Using Deep Reinforcement Learning, K. Narasimhan et al., EMNLP, 2015.
Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences, H. Mei et al., arXiv, 2015.

Visual Domain

Model-Free Episodic Control, C. Blundell et al., arXiv, 2016.
Deep Successor Reinforcement Learning, T. D. Kulkarni et al., arXiv, 2016.
Unifying Count-Based Exploration and Intrinsic Motivation, M. G. Bellemare et al., arXiv, 2016.
Control of Memory, Active Perception, and Action in Minecraft, J. Oh et al., ICML, 2016.
Dynamic Frame skip Deep Q Network, A. S. Lakshminarayanan et al., IJCAI Deep RL Workshop, 2016.
Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv, 2016.
Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv, 2016.
Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, S. Levine et al., arXiv, 2016.
Deep Exploration via Bootstrapped DQN, I. Osband et al., arXiv, 2016.
Value Iteration Networks, A. Tamar et al., arXiv, 2016.
Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv, 2016.
Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature, 2016.
Increasing the Action Gap: New Operators for Reinforcement Learning, M. G. Bellemare et al., AAAI, 2016.
Memory-based control with recurrent neural networks, N. Heess et al., NIPS Workshop, 2015.
How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies, V. François-Lavet et al., NIPS Workshop, 2015.
Multiagent Cooperation and Competition with Deep Reinforcement Learning, A. Tampuu et al., arXiv, 2015.
Dueling Network Architectures for Deep Reinforcement Learning, Z. Wang et al., arXiv, 2015.
Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, E. Parisotto et al., ICLR, 2016.
Better Computer Go Player with Neural Network and Long-term Prediction, Y. Tian et al., ICLR, 2016.
Policy Distillation, A. A. Rusu et al., ICLR, 2016.
Prioritized Experience Replay, T. Schaul et al., ICLR, 2016.
Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., ICLR, 2016.
Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control, F. Zhang et al., arXiv, 2015.
Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, S. Mohamed and D. J. Rezende, arXiv, 2015.
Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., arXiv, 2015.
Continuous control with deep reinforcement learning, T. P. Lillicrap et al., ICLR, 2016.
Giraffe: Using Deep Reinforcement Learning to Play Chess, M. Lai, arXiv, 2015.
Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., NIPS, 2015.
Learning Continuous Control Policies by Stochastic Value Gradients, N. Heess et al., NIPS, 2015.
Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, arXiv, 2015.
Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., arXiv, 2015.
High-Dimensional Continuous Control Using Generalized Advantage Estimation, J. Schulman et al., ICLR, 2016.
End-to-End Training of Deep Visuomotor Policies, S. Levine et al., arXiv, 2015.
Universal Value Function Approximators, T. Schaul et al., ICML, 2015.
Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., ICML Workshop, 2015.
Trust Region Policy Optimization, J. Schulman et al., ICML, 2015.
Human-level control through deep reinforcement learning, V. Mnih et al., Nature, 2015.
Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, X. Guo et al., NIPS, 2014.
Playing Atari with Deep Reinforcement Learning, V. Mnih et al., NIPS Workshop, 2013.

Robotics

Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks, R. Houthooft et al., arXiv, 2016.
Benchmarking Deep Reinforcement Learning for Continuous Control, Y. Duan et al., ICML, 2016.
Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, S. Levine et al., arXiv, 2016.
Continuous Deep Q-Learning with Model-based Acceleration, S. Gu et al., ICML, 2016.
Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, C. Finn et al., arXiv, 2016.
Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv, 2016.
Memory-based control with recurrent neural networks, N. Heess et al., NIPS Workshop, 2015.
Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control, F. Zhang et al., arXiv, 2015.
Learning Continuous Control Policies by Stochastic Value Gradients, N. Heess et al., NIPS, 2015.
Learning Deep Neural Network Policies with Continuous Memory States, M. Zhang et al., arXiv, 2015.
High-Dimensional Continuous Control Using Generalized Advantage Estimation, J. Schulman et al., ICLR, 2016.
End-to-End Training of Deep Visuomotor Policies, S. Levine et al., arXiv, 2015.
DeepMPC: Learning Deep Latent Features for Model Predictive Control, I. Lenz et al., RSS, 2015.
Trust Region Policy Optimization, J. Schulman et al., ICML, 2015.

Games

Model-Free Episodic Control, C. Blundell et al., arXiv, 2016.
Safe and Efficient Off-Policy Reinforcement Learning, R. Munos et al., arXiv, 2016.
Deep Successor Reinforcement Learning, T. D. Kulkarni et al., arXiv, 2016.
Unifying Count-Based Exploration and Intrinsic Motivation, M. G. Bellemare et al., arXiv, 2016.
Control of Memory, Active Perception, and Action in Minecraft, J. Oh et al., ICML, 2016.
Dynamic Frame skip Deep Q Network, A. S. Lakshminarayanan et al., IJCAI Deep RL Workshop, 2016.
Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv, 2016.
Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv, 2016.
Deep Exploration via Bootstrapped DQN, I. Osband et al., arXiv, 2016.
Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks, J. N. Foerster et al., arXiv, 2016.
Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv, 2016.
Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature, 2016.
Increasing the Action Gap: New Operators for Reinforcement Learning, M. G. Bellemare et al., AAAI, 2016.
How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies, V. François-Lavet et al., NIPS Workshop, 2015.
Multiagent Cooperation and Competition with Deep Reinforcement Learning, A. Tampuu et al., arXiv, 2015.
MazeBase: A Sandbox for Learning from Games, S. Sukhbaatar et al., arXiv, 2016.
Dueling Network Architectures for Deep Reinforcement Learning, Z. Wang et al., arXiv, 2015.
Better Computer Go Player with Neural Network and Long-term Prediction, Y. Tian et al., ICLR, 2016.
Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, E. Parisotto et al., ICLR, 2016.
Policy Distillation, A. A. Rusu et al., ICLR, 2016.
Prioritized Experience Replay, T. Schaul et al., ICLR, 2016.
Deep Reinforcement Learning with an Action Space Defined by Natural Language, J. He et al., arXiv, 2015.
Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., ICLR, 2016.
Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, S. Mohamed and D. J. Rezende, arXiv, 2015.
Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., arXiv, 2015.
Continuous control with deep reinforcement learning, T. P. Lillicrap et al., ICLR, 2016.
Language Understanding for Text-based Games Using Deep Reinforcement Learning, K. Narasimhan et al., EMNLP, 2015.
Giraffe: Using Deep Reinforcement Learning to Play Chess, M. Lai, arXiv, 2015.
Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., NIPS, 2015.
Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, arXiv, 2015.
Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., arXiv, 2015.
Universal Value Function Approximators, T. Schaul et al., ICML, 2015.
Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., ICML Workshop, 2015.
Trust Region Policy Optimization, J. Schulman et al., ICML, 2015.
Human-level control through deep reinforcement learning, V. Mnih et al., Nature, 2015.
Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, X. Guo et al., NIPS, 2014.
Playing Atari with Deep Reinforcement Learning, V. Mnih et al., NIPS Workshop, 2013.

Monte-Carlo Tree Search

Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature, 2016.
Better Computer Go Player with Neural Network and Long-term Prediction, Y. Tian et al., ICLR, 2016.
Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, X. Guo et al., NIPS, 2014.

Inverse Reinforcement Learning

Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, C. Finn et al., arXiv, 2016.
Maximum Entropy Deep Inverse Reinforcement Learning, M. Wulfmeier et al., arXiv, 2015.

Multi-Task and Transfer Learning

Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, E. Parisotto et al., ICLR, 2016.
Policy Distillation, A. A. Rusu et al., ICLR, 2016.
ADAAPT: A Deep Architecture for Adaptive Policy Transfer from Multiple Sources, J. Rajendran et al., arXiv, 2015.
Universal Value Function Approximators, T. Schaul et al., ICML, 2015.

Improving Exploration

Unifying Count-Based Exploration and Intrinsic Motivation, M. G. Bellemare et al., arXiv, 2016.
Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks, R. Houthooft et al., arXiv, 2016.
Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv, 2016.
Deep Exploration via Bootstrapped DQN, I. Osband et al., arXiv, 2016.
Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., NIPS, 2015.
Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., arXiv, 2015.

Multi-Agent

Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks, J. N. Foerster et al., arXiv, 2016.
Multiagent Cooperation and Competition with Deep Reinforcement Learning, A. Tampuu et al., arXiv, 2015.

Hierarchical Learning

Deep Successor Reinforcement Learning, T. D. Kulkarni et al., arXiv, 2016.
Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv, 2016.
Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv, 2016.
