Introducing Deep Reinforcement
The manuscript of Deep Reinforcement Learning is available now! It makes significant improvements to Deep Reinforcement Learning: An Overview, which has received 100+ citations, by extending its latest version more than one year ago from 70 pages to 150 pages.
It draws a big picture of deep reinforcement learning (RL) with many details. It covers contemporary work in historical contexts. It endeavours to answer the following questions: 1) Why deep? 2) What is the state of the art? and, 3) What are the issues, and potential solutions? It attempts to help those who want to get more familiar with deep RL, and to serve as a reference for people interested in this fascinating area, like professors, researchers, students, engineers, managers, investors, etc. Shortcomings and mistakes are inevitable; comments and criticisms are welcome.
The manuscript introduces AI, machine learning, and deep learning briefly, and provides a mini tutorial for reinforcement learning. The following figure illustrates relationships among these concepts, with major contents for machine learning and AI .Deep reinforcement learning is reinforcement learning integrated with deep learning, or deep artificial neural networks. A blog is dedicated to Resources for Deep Reinforcement Learning.

The manuscript covers six core elements: value function, policy, reward, model, exploration vs. exploitation, and representation; six important mechanisms: attention and memory, unsupervised learning, hierarchical RL, multi-agent RL, relational RL, and learning to learn; and twelve applications: games, robotics, natural language processing (NLP), computer vision, finance, business management, healthcare, education, energy, transportation, computer systems, and, science, engineering, and art.

Deep reinforcement learning has made exceptional achievements, e.g., DQN applying to Atari games ignited this wave of deep RL, and AlphaGo (Zero) and DeepStack set landmarks for AI. Deep RL has many newly invented algorithms/architectures, e.g., DQN, A3C, TRPO, PPO, DDPG, Trust-PCL, GPS, UNREAL, DNC, etc. Moreover, deep RL has been enjoying very abound and diverse applications, e.g., Capture the Flag, Dota 2, StarCraft II, robotics, character animation, conversational AI, neural architecture design (AutoML), data center cooling, recommender systems, data augmentation, model compression, combinatorial optimization, program synthesis, theorem proving, medical imaging, music, and chemical retrosynthesis, so on and so forth. A blog is dedicated to Reinforcement Learning applications.
In general, RL is probably helpful, if a problem can be regarded as or transformed to a sequential decision making problem, and states, actions, maybe rewards, can be constructed; sometimes the problem may not appear as an RL problem on the surface. Roughly speaking, if a task involves some manual designed “strategy”, then there is a chance for reinforcement learning to help. Creativity would push the frontiers of deep RL further with respect to core elements, important mechanisms, and applications.
Albeit being so successful, deep RL encounters many issues, like credit assignment, sparse reward, sample efficiency, instability, divergence, interpretability, safety, etc.; even reproducibility is an issue.

Six research directions are proposed as both challenges and opporrtunities. There are already some progress in these directions, e.g., Dopamine, TStarBots, MOREL, GQN, visual reasoning, neural-symbolic learning, UPN, causal InfoGAN, meta-gradient RL, along with many applications as above.
- systematic, comparative study of deep RL algorithms
- “solve” multi-agent problems
- learn from entities, but not just raw inputs
- design an optimal representation for RL
- AutoRL
- develop killer applications for (deep) RL
It is desirable to integrate RL more deeply with AI, with more intelligence in the end-to-end mapping from raw inputs to decisions, to incorporate knowledge, to have common sense, to be more efficient, to be more interpretable, and to avoid obvious mistakes, etc., rather than working as a blackbox.

Deep learning and reinforcement learning, being selected as one of the MIT Technology Review 10 Breakthrough Technologies in 2013 and 2017 respectively, will play their crucial roles in achieving artificial general intelligence. David Silver proposed a conjecture: artificial intelligence = reinforcement learning + deep learning (AI = RL + DL). We will see both deep learning and reinforcement learning prospering in the coming years and beyond. Deep learning is exploding. It is the right time to nurture, educate and lead the market for reinforcement learning.
Deep learning, in this third wave of AI, will have deeper influences, as we have already seen from its many achievements. Reinforcement learning, as a more general learning and decision making paradigm, will deeply influence deep learning, machine learning, and artificial intelligence in general.
Introducing Deep Reinforcement的更多相关文章
- 论文笔记之:Asynchronous Methods for Deep Reinforcement Learning
Asynchronous Methods for Deep Reinforcement Learning ICML 2016 深度强化学习最近被人发现貌似不太稳定,有人提出很多改善的方法,这些方法有很 ...
- (转) Playing FPS games with deep reinforcement learning
Playing FPS games with deep reinforcement learning 博文转自:https://blog.acolyer.org/2016/11/23/playing- ...
- (zhuan) Deep Reinforcement Learning Papers
Deep Reinforcement Learning Papers A list of recent papers regarding deep reinforcement learning. Th ...
- Learning Roadmap of Deep Reinforcement Learning
1. 知乎上关于DQN入门的系列文章 1.1 DQN 从入门到放弃 DQN 从入门到放弃1 DQN与增强学习 DQN 从入门到放弃2 增强学习与MDP DQN 从入门到放弃3 价值函数与Bellman ...
- (转) Deep Reinforcement Learning: Playing a Racing Game
Byte Tank Posts Archive Deep Reinforcement Learning: Playing a Racing Game OCT 6TH, 2016 Agent playi ...
- 论文笔记之:Dueling Network Architectures for Deep Reinforcement Learning
Dueling Network Architectures for Deep Reinforcement Learning ICML 2016 Best Paper 摘要:本文的贡献点主要是在 DQN ...
- getting started with building a ROS simulation platform for Deep Reinforcement Learning
Apparently, this ongoing work is to make a preparation for futural research on Deep Reinforcement Le ...
- (转) Deep Reinforcement Learning: Pong from Pixels
Andrej Karpathy blog About Hacker's guide to Neural Networks Deep Reinforcement Learning: Pong from ...
- 论文笔记之:Deep Reinforcement Learning with Double Q-learning
Deep Reinforcement Learning with Double Q-learning Google DeepMind Abstract 主流的 Q-learning 算法过高的估计在特 ...
随机推荐
- PAT 天梯赛 是否完全二叉搜索树 (30分)(二叉搜索树 数组)
将一系列给定数字顺序插入一个初始为空的二叉搜索树(定义为左子树键值大,右子树键值小),你需要判断最后的树是否一棵完全二叉树,并且给出其层序遍历的结果. 输入格式: 输入第一行给出一个不超过20的正整数 ...
- linux C/C++服务器后台开发面试题总结
一.编程语言 1.根据熟悉的语言,谈谈两种语言的区别? 主要浅谈下C/C++和PHP语言的区别: 1)PHP弱类型语言,一种脚本语言,对数据的类型不要求过多,较多的应用于Web应用开发,现在好多互联网 ...
- windows运行打开服务命令
1. gpedit.msc-----组策略 2. sndrec32-------录音机 3. Nslookup-------IP地址侦测器 4. explorer-------打开资源管 ...
- hdu2090-2097
hdu2090 模拟 #include<stdio.h> int main(){ ]; ,a1,a2; ; while(scanf("%s%lf%lf",b,& ...
- 【网络通讯】Nat知识了解
一.Nat的含义 NAT(Network Address Translation,网络地址转换)是1994年提出的.当在专用网内部的一些主机本来已经分配到了本地IP地址(即仅在本专用网内使用的专用地址 ...
- JNI学习笔记_Java调用C —— Android中使用的方法
一.笔记 1.JNI(Java Native Interface),就是如何使用java去访问C/C++编写的那些库.若想深入了解JNI可以看官方文档jni.pdf.优秀博文:Android JNI知 ...
- ubuntu14.04下安装ffmpeg
ubuntu14.04下安装ffmpeg 一.安装各种依赖包 1.yasm(libx264需要依赖yasm) sudo apt-get install yasm 2.libx264 sudo apt- ...
- Oracle 存储过程了解
简要记录存储过程语法与Java程序的调用方式 一 存储过程 首先,我们建立一个简单的表进行存储过程的测试 createtable xuesheng(id integer, xing_ming varc ...
- pycharm -- 导入主题(theme) and 修改背景颜色(护眼色)
前情提要 众所周知,随着python语言的不断流行,越来越多的程序员开始用python来开发自己的项目以及产品. pycharm作为一款流行的IDE,被越来越多的程序员所接受和使用. 尽管pychar ...
- 【转】每天一个linux命令(25):linux文件属性详解
原文网址:http://www.cnblogs.com/peida/archive/2012/11/23/2783762.html Linux 文件或目录的属性主要包括:文件或目录的节点.种类.权限模 ...