The manuscript of Deep Reinforcement Learning is available now! It makes significant improvements to Deep Reinforcement Learning: An Overview, which has received 100+ citations, by extending its latest version more than one year ago from 70 pages to 150 pages.

It draws a big picture of deep reinforcement learning (RL) with many details. It covers contemporary work in historical contexts. It endeavours to answer the following questions: 1) Why deep? 2) What is the state of the art? and, 3) What are the issues, and potential solutions? It attempts to help those who want to get more familiar with deep RL, and to serve as a reference for people interested in this fascinating area, like professors, researchers, students, engineers, managers, investors, etc. Shortcomings and mistakes are inevitable; comments and criticisms are welcome.

The manuscript introduces AI, machine learning, and deep learning briefly, and provides a mini tutorial for reinforcement learning. The following figure illustrates relationships among these concepts, with major contents for machine learning and AI .Deep reinforcement learning is reinforcement learning integrated with deep learning, or deep artificial neural networks. A blog is dedicated to Resources for Deep Reinforcement Learning.

 

The manuscript covers six core elements: value function, policy, reward, model, exploration vs. exploitation, and representation; six important mechanisms: attention and memory, unsupervised learning, hierarchical RL, multi-agent RL, relational RL, and learning to learn; and twelve applications: games, robotics, natural language processing (NLP), computer vision, finance, business management, healthcare, education, energy, transportation, computer systems, and, science, engineering, and art.

 

Deep reinforcement learning has made exceptional achievements, e.g., DQN applying to Atari games ignited this wave of deep RL, and AlphaGo (Zero) and DeepStack set landmarks for AI. Deep RL has many newly invented algorithms/architectures, e.g., DQNA3CTRPOPPODDPGTrust-PCLGPSUNREALDNC, etc. Moreover, deep RL has been enjoying very abound and diverse applications, e.g., Capture the FlagDota 2StarCraft IIroboticscharacter animationconversational AIneural architecture design (AutoML)data center coolingrecommender systemsdata augmentationmodel compressioncombinatorial optimizationprogram synthesistheorem provingmedical imagingmusic, and chemical retrosynthesis, so on and so forth. A blog is dedicated to Reinforcement Learning applications.

In general, RL is probably helpful, if a problem can be regarded as or transformed to a sequential decision making problem, and states, actions, maybe rewards, can be constructed; sometimes the problem may not appear as an RL problem on the surface. Roughly speaking, if a task involves some manual designed “strategy”, then there is a chance for reinforcement learning to help. Creativity would push the frontiers of deep RL further with respect to core elements, important mechanisms, and applications.

Albeit being so successful, deep RL encounters many issues, like credit assignment, sparse reward, sample efficiency, instability, divergence, interpretability, safety, etc.; even reproducibility is an issue.

 

Six research directions are proposed as both challenges and opporrtunities. There are already some progress in these directions, e.g., DopamineTStarBotsMORELGQN, visual reasoningneural-symbolic learningUPNcausal InfoGANmeta-gradient RL, along with many applications as above.

  1. systematic, comparative study of deep RL algorithms
  2. “solve” multi-agent problems
  3. learn from entities, but not just raw inputs
  4. design an optimal representation for RL
  5. AutoRL
  6. develop killer applications for (deep) RL

It is desirable to integrate RL more deeply with AI, with more intelligence in the end-to-end mapping from raw inputs to decisions, to incorporate knowledge, to have common sense, to be more efficient, to be more interpretable, and to avoid obvious mistakes, etc., rather than working as a blackbox.

 

Deep learning and reinforcement learning, being selected as one of the MIT Technology Review 10 Breakthrough Technologies in 2013 and 2017 respectively, will play their crucial roles in achieving artificial general intelligence. David Silver proposed a conjecture: artificial intelligence = reinforcement learning + deep learning (AI = RL + DL). We will see both deep learning and reinforcement learning prospering in the coming years and beyond. Deep learning is exploding. It is the right time to nurture, educate and lead the market for reinforcement learning.

Deep learning, in this third wave of AI, will have deeper influences, as we have already seen from its many achievements. Reinforcement learning, as a more general learning and decision making paradigm, will deeply influence deep learning, machine learning, and artificial intelligence in general.

Introducing Deep Reinforcement的更多相关文章

  1. 论文笔记之:Asynchronous Methods for Deep Reinforcement Learning

    Asynchronous Methods for Deep Reinforcement Learning ICML 2016 深度强化学习最近被人发现貌似不太稳定,有人提出很多改善的方法,这些方法有很 ...

  2. (转) Playing FPS games with deep reinforcement learning

    Playing FPS games with deep reinforcement learning 博文转自:https://blog.acolyer.org/2016/11/23/playing- ...

  3. (zhuan) Deep Reinforcement Learning Papers

    Deep Reinforcement Learning Papers A list of recent papers regarding deep reinforcement learning. Th ...

  4. Learning Roadmap of Deep Reinforcement Learning

    1. 知乎上关于DQN入门的系列文章 1.1 DQN 从入门到放弃 DQN 从入门到放弃1 DQN与增强学习 DQN 从入门到放弃2 增强学习与MDP DQN 从入门到放弃3 价值函数与Bellman ...

  5. (转) Deep Reinforcement Learning: Playing a Racing Game

    Byte Tank Posts Archive Deep Reinforcement Learning: Playing a Racing Game OCT 6TH, 2016 Agent playi ...

  6. 论文笔记之:Dueling Network Architectures for Deep Reinforcement Learning

    Dueling Network Architectures for Deep Reinforcement Learning ICML 2016 Best Paper 摘要:本文的贡献点主要是在 DQN ...

  7. getting started with building a ROS simulation platform for Deep Reinforcement Learning

    Apparently, this ongoing work is to make a preparation for futural research on Deep Reinforcement Le ...

  8. (转) Deep Reinforcement Learning: Pong from Pixels

    Andrej Karpathy blog About Hacker's guide to Neural Networks Deep Reinforcement Learning: Pong from ...

  9. 论文笔记之:Deep Reinforcement Learning with Double Q-learning

    Deep Reinforcement Learning with Double Q-learning Google DeepMind Abstract 主流的 Q-learning 算法过高的估计在特 ...

随机推荐

  1. 工业通信的开源项目 HslCommunication 介绍

    前言: 本项目的孵化说来也是机缘巧合的事,本人于13年杭州某大学毕业后去了一家大型的国企工作,慢慢的走上了工业软件,上位机软件开发的道路.于14年正式开发基于windows的软件,当时可选的技术栈就是 ...

  2. 基础练习 Sine之舞

    基础练习 Sine之舞   时间限制:1.0s   内存限制:512.0MB        问题描述 最近FJ为他的奶牛们开设了数学分析课,FJ知道若要学好这门课,必须有一个好的三角函数基本功.所以他 ...

  3. matlab fopen()

    file=dir('/home/wang/Desktop/trainset/others/'); :length(file) path= strcat('/home/wang/Desktop/trai ...

  4. 20155310 2016-2017-2 《Java程序设计》第八周学习总结

    20155310 2016-2017-2 <Java程序设计>第八周学习总结 教材学习内容总结 第十五章 通用API 通用API •日志:日志对信息安全意义重大,审计.取证.入侵检验等都会 ...

  5. AngularJS的简单订阅发布模式例子

    控制器之间的交互方式广播 broadcast, 发射 emit 事件 类似于 js中的事件 , 可以自己定义事件 向上传递直到 document 在AngularJs中 向上传递直到 rootScop ...

  6. .NET 编写一个可以异步等待循环中任何一个部分的 Awaiter

    林德熙 小伙伴希望保存一个文件,并且希望如果出错了也要不断地重试.然而我认为如果一直错误则应该对外抛出异常让调用者知道为什么会一直错误. 这似乎是一个矛盾的要求.然而最终我想到了一个办法:让重试一直进 ...

  7. (2)bytes类型

    bytes类型就是字节类型 把8个二进制一组称为一个byte,用16进制来表示 Python2里面字符串其实更应该称为字节串,但是python2里面有一个类型是butes,所以在Python2里面by ...

  8. WPF控件NumericUpDown (转)

    WPF控件NumericUpDown示例 (转载请注明出处) 工具:Expression Blend 2 + Visual Studio 2008 语言:C# 框架:.Net Framework 3. ...

  9. cin和cout详解

    无论输入数字还是字符串,一个回车键是把输入的这个东西送到变量中,可以一次性送到 一个(或者多个)空格键是分隔这些值的 cout <<N; for(int i=0;i<5;i++) { ...

  10. jquery append、prepend、before等等

    1.jQuery append() 方法 jQuery append() 方法在被选元素的结尾插入内容. 实例 复制代码代码如下: $("p").append("Some ...