Awesome Reinforcement Learning
A curated list of resources dedicated to reinforcement learning.
We have pages for other topics: awesome-rnn, awesome-deep-vision, awesome-random-forest
Maintainers: Hyunsoo Kim, Jiwon Kim
We are looking for more contributors and maintainers!
Contributing
Please feel free to send pull requests.
Codes
- Codes for examples and exercises in Richard Sutton and Andrew Barto's Reinforcement Learning: An Introduction
- Simulation code for Reinforcement Learning Control Problems
- MATLAB Environment and GUI for Reinforcement Learning
- Reinforcement Learning Repository - University of Massachusetts, Amherst
- Brown-UMBC Reinforcement Learning and Planning Library (Java)
- Reinforcement Learning in R (MDP, Value Iteration)
- Reinforcement Learning Environment in Python and MATLAB
- RL-Glue (standard interface for RL) and RL-Glue Library
- PyBrain Library - Python-Based Reinforcement learning, Artificial intelligence, and Neural network
- RLPy Framework - Value-Function-Based Reinforcement Learning Framework for Education and Research
- Maja - Machine learning framework for problems in Reinforcement Learning in python
- TeachingBox - Java based Reinforcement Learning framework
- Policy Gradient Reinforcement Learning Toolbox for MATLAB
- PIQLE - Platform Implementing Q-LEarning and other RL algorithms
- BeliefBox - Bayesian reinforcement learning library and toolkit
- Deep Q-Learning with TensorFlow - A deep Q learning demonstration using Google TensorFlow
Theory
Lectures
- [UCL] COMPM050/COMPGI13 Reinforcement Learning by David Silver
- [UC Berkeley] CS188 Artificial Intelligence by Pieter Abbeel
- [Udacity (Georgia Tech.)] Machine Learning 3: Reinforcement Learning (CS7641)
- [Stanford] CS229 Machine Learning - Lecture 16: Reinforcement Learning by Andrew Ng
Books
- Richard Sutton and Andrew Barto, Reinforcement Learning: An Introduction [Book] [Code]
- Csaba Szepesvari, Algorithms for Reinforcement Learning [Book]
- David Poole and Alan Mackworth, Artificial Intelligence: Foundations of Computational Agents [Book Chapter]
- Dimitri P. Bertsekas and John N. Tsitsiklis, Neuro-Dynamic Programming [Book (Amazon)] [Summary]
- Mykel J. Kochenderfer, Decision Making Under Uncertainty: Theory and Application [Book (Amazon)]
Surveys
- Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore, Reinforcement Learning: A Survey, JAIR, 1996. [Paper]
- S. S. Keerthi and B. Ravindran, A Tutorial Survey of Reinforcement Learning, Sadhana, 1994. [Paper]
- Matthew E. Taylor, Peter Stone, Transfer Learning for Reinforcement Learning Domains: A Survey, JMLR, 2009. [Paper]
- Jens Kober, J. Andrew Bagnell, Jan Peters, Reinforcement Learning in Robotics, A Survey, IJRR, 2013. [Paper]
- Michael L. Littman, "Reinforcement learning improves behaviour from evaluative feedback." Nature 521.7553 (2015): 445-451. [Paper]
- Marc P. Deisenroth, Gerhard Neumann, Jan Peters, A Survey on Policy Search for Robotics, Foundations and Trends in Robotics, 2014. [Book]
Papers / Thesis
Foundational Papers
- Marvin Minsky, Steps toward Artificial Intelligence, Proceedings of the IRE, 1961. [Paper]
  - Discusses issues in RL such as the "credit assignment problem".
- Ian H. Witten, An Adaptive Optimal Controller for Discrete-Time Markov Environments, Information and Control, 1977. [Paper]
  - Earliest publication on the temporal-difference (TD) learning rule.
Methods
- Dynamic Programming (DP):
  - Christopher J. C. H. Watkins, Learning from Delayed Rewards, Ph.D. Thesis, Cambridge University, 1989. [Thesis]
- Monte Carlo:
- Temporal-Difference:
  - Richard S. Sutton, Learning to Predict by the Methods of Temporal Differences, Machine Learning 3: 9-44, 1988. [Paper]
- Q-Learning (Off-policy TD algorithm):
  - Chris Watkins, Learning from Delayed Rewards, Cambridge, 1989. [Thesis]
- Sarsa (On-policy TD algorithm):
- R-Learning (learning of relative values):
  - Andrew Schwartz, A Reinforcement Learning Method for Maximizing Undiscounted Rewards, ICML, 1993. [Paper-Google Scholar]
- Function Approximation methods (Least-Squares Temporal Difference, Least-Squares Policy Iteration)
- Policy Search / Policy Gradient:
  - Richard Sutton, David McAllester, Satinder Singh, Yishay Mansour, Policy Gradient Methods for Reinforcement Learning with Function Approximation, NIPS, 1999. [Paper]
  - Jan Peters, Sethu Vijayakumar, Stefan Schaal, Natural Actor-Critic, ECML, 2005. [Paper]
  - Jens Kober, Jan Peters, Policy Search for Motor Primitives in Robotics, NIPS, 2009. [Paper]
  - Jan Peters, Katharina Mülling, Yasemin Altun, Relative Entropy Policy Search, AAAI, 2010. [Paper]
  - Freek Stulp, Olivier Sigaud, Path Integral Policy Improvement with Covariance Matrix Adaptation, ICML, 2012. [Paper]
  - Nate Kohl, Peter Stone, Policy Gradient Reinforcement Learning for Fast Quadrupedal Locomotion, ICRA, 2004. [Paper]
  - Marc Deisenroth, Carl Rasmussen, PILCO: A Model-Based and Data-Efficient Approach to Policy Search, ICML, 2011. [Paper]
  - Scott Kuindersma, Roderic Grupen, Andrew Barto, Learning Dynamic Arm Motions for Postural Recovery, Humanoids, 2011. [Paper]
- Hierarchical RL
- Deep Learning + Reinforcement Learning (a sample of recent works on DL+RL):
  - V. Mnih et al., Human-level Control through Deep Reinforcement Learning, Nature, 2015. [Paper]
  - Xiaoxiao Guo, Satinder Singh, Honglak Lee, Richard Lewis, Xiaoshi Wang, Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, NIPS, 2014. [Paper]
  - Sergey Levine, Chelsea Finn, Trevor Darrell, Pieter Abbeel, End-to-End Training of Deep Visuomotor Policies, ArXiv, 16 Oct 2015. [ArXiv]
  - Tom Schaul, John Quan, Ioannis Antonoglou, David Silver, Prioritized Experience Replay, ArXiv, 18 Nov 2015. [ArXiv]
  - Hado van Hasselt, Arthur Guez, David Silver, Deep Reinforcement Learning with Double Q-Learning, ArXiv, 22 Sep 2015. [ArXiv]
  - Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu, Asynchronous Methods for Deep Reinforcement Learning, ArXiv, 4 Feb 2016. [ArXiv]
Applications
Game Playing
Traditional Games
Computer Games
- Human-level Control through Deep Reinforcement Learning (Mnih, Nature 2015) [Paper] [Code] [Video]
- Flappy Bird Reinforcement Learning [Video]
- MarI/O - learning to play Mario with evolutionary reinforcement learning using artificial neural networks (Stanley, Evolutionary Computation 2002) [Paper] [Video]
Robotics
- Policy Gradient Reinforcement Learning for Fast Quadrupedal Locomotion (Kohl, ICRA 2004) [Paper]
- Robot Motor Skill Coordination with EM-based Reinforcement Learning (Kormushev, IROS 2010) [Paper] [Video]
- Generalized Model Learning for Reinforcement Learning on a Humanoid Robot (Hester, ICRA 2010) [Paper] [Video]
- Autonomous Skill Acquisition on a Mobile Manipulator (Konidaris, AAAI 2011) [Paper] [Video]
- PILCO: A Model-Based and Data-Efficient Approach to Policy Search (Deisenroth, ICML 2011) [Paper]
- Incremental Semantically Grounded Learning from Demonstration (Niekum, RSS 2013) [Paper]
- Efficient Reinforcement Learning for Robots using Informative Simulated Priors (Cutler, ICRA 2015) [Paper] [Video]
Control
- An Application of Reinforcement Learning to Aerobatic Helicopter Flight (Abbeel, NIPS 2006) [Paper] [Video]
- Autonomous Helicopter Control using Reinforcement Learning Policy Search Methods (Bagnell, ICRA 2001) [Paper]
Operations Research
- Scaling Average-reward Reinforcement Learning for Product Delivery (Proper, AAAI 2004) [Paper]
- Cross Channel Optimized Marketing by Reinforcement Learning (Abe, KDD 2004) [Paper]
Human Computer Interaction
- Optimizing Dialogue Management with Reinforcement Learning: Experiments with the NJFun System (Singh, JAIR 2002) [Paper]
Tutorials / Websites
- Mance Harmon and Stephanie Harmon, Reinforcement Learning: A Tutorial
- Short introduction to some Reinforcement Learning algorithms
- C. Igel, M.A. Riedmiller, et al., Reinforcement Learning in a Nutshell, ESANN, 2007. [Paper]
- UNSW - Reinforcement Learning
- ROS Reinforcement Learning Tutorial
- POMDP for Dummies
- Scholarpedia articles on reinforcement learning
- Repository with useful MATLAB Software, presentations, and demo videos
- Bibliography on Reinforcement Learning
- UC Berkeley - CS 294: Deep Reinforcement Learning, Fall 2015 (John Schulman, Pieter Abbeel) [Class Website]
- Blog posts on Reinforcement Learning, Parts 1-4 by Travis DeWolf
- The Arcade Learning Environment - Atari 2600 games environment for developing AI agents
- Deep Reinforcement Learning: Pong from Pixels by Andrej Karpathy
- Demystifying Deep Reinforcement Learning
Online Demos
- Real-world demonstrations of Reinforcement Learning
- Deep Q-Learning Demo - A deep Q learning demonstration using ConvNetJS
- Deep Q-Learning with TensorFlow - A deep Q learning demonstration using Google TensorFlow
- Reinforcement Learning Demo - A reinforcement learning demo using reinforcejs by Andrej Karpathy