The manuscript of Deep Reinforcement Learning is now available! It significantly improves on Deep Reinforcement Learning: An Overview, which has received 100+ citations, extending the latest version, released more than a year ago, from 70 pages to 150 pages.

It draws a big picture of deep reinforcement learning (RL) with many details. It covers contemporary work in historical contexts. It endeavours to answer the following questions: 1) Why deep? 2) What is the state of the art? and, 3) What are the issues, and potential solutions? It attempts to help those who want to get more familiar with deep RL, and to serve as a reference for people interested in this fascinating area, like professors, researchers, students, engineers, managers, investors, etc. Shortcomings and mistakes are inevitable; comments and criticisms are welcome.

The manuscript introduces AI, machine learning, and deep learning briefly, and provides a mini tutorial for reinforcement learning. The following figure illustrates relationships among these concepts, with major contents for machine learning and AI. Deep reinforcement learning is reinforcement learning integrated with deep learning, or deep artificial neural networks. A blog is dedicated to Resources for Deep Reinforcement Learning.

 

The manuscript covers six core elements: value function, policy, reward, model, exploration vs. exploitation, and representation; six important mechanisms: attention and memory, unsupervised learning, hierarchical RL, multi-agent RL, relational RL, and learning to learn; and twelve applications: games, robotics, natural language processing (NLP), computer vision, finance, business management, healthcare, education, energy, transportation, computer systems, and science, engineering, and art.

 

Deep reinforcement learning has made exceptional achievements, e.g., DQN applied to Atari games ignited this wave of deep RL, and AlphaGo (Zero) and DeepStack set landmarks for AI. Deep RL has many newly invented algorithms/architectures, e.g., DQN, A3C, TRPO, PPO, DDPG, Trust-PCL, GPS, UNREAL, DNC, etc. Moreover, deep RL has been enjoying abundant and diverse applications, e.g., Capture the Flag, Dota 2, StarCraft II, robotics, character animation, conversational AI, neural architecture design (AutoML), data center cooling, recommender systems, data augmentation, model compression, combinatorial optimization, program synthesis, theorem proving, medical imaging, music, and chemical retrosynthesis, and so on. A blog is dedicated to Reinforcement Learning applications.

In general, RL is probably helpful if a problem can be regarded as, or transformed into, a sequential decision making problem, and states, actions, and maybe rewards can be constructed; sometimes the problem may not appear to be an RL problem on the surface. Roughly speaking, if a task involves some manually designed “strategy”, then there is a chance for reinforcement learning to help. Creativity would push the frontiers of deep RL further with respect to core elements, important mechanisms, and applications.
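To make the idea of constructing states, actions, and rewards concrete, here is a minimal sketch, not taken from the manuscript. It assumes a hypothetical toy task: an agent on a chain of five positions that is rewarded only when it reaches the rightmost one. The environment, the function names, and the hyperparameters are all illustrative assumptions. The learner is plain tabular Q-learning, not a deep RL method.

```python
import random

N_STATES = 5      # hypothetical chain: states 0..4, state 4 is the goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (an assumption): return (next_state, reward, done)."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(1000):  # cap on steps per episode
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                # exploit, breaking ties between equal values at random
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Q-learning target: bootstrap from the best next action
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best action in each non-terminal state according to the table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = q_learning()
policy = greedy_policy(q_table)
print(policy)
```

In this sketch the "strategy" (always move right) is never written by hand; it is recovered from the reward signal alone. A deep RL method would replace the table `q` with a neural network.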

Despite its success, deep RL encounters many issues, like credit assignment, sparse reward, sample efficiency, instability, divergence, interpretability, safety, etc.; even reproducibility is an issue.

 

Six research directions are proposed as both challenges and opportunities. There is already some progress in these directions, e.g., Dopamine, TStarBots, MOREL, GQN, visual reasoning, neural-symbolic learning, UPN, causal InfoGAN, and meta-gradient RL, along with many applications as above.

  1. systematic, comparative study of deep RL algorithms
  2. “solve” multi-agent problems
  3. learn from entities, but not just raw inputs
  4. design an optimal representation for RL
  5. AutoRL
  6. develop killer applications for (deep) RL

It is desirable to integrate RL more deeply with AI, with more intelligence in the end-to-end mapping from raw inputs to decisions, to incorporate knowledge, to have common sense, to be more efficient, to be more interpretable, and to avoid obvious mistakes, etc., rather than working as a blackbox.

 

Deep learning and reinforcement learning, selected among the MIT Technology Review 10 Breakthrough Technologies in 2013 and 2017 respectively, will play crucial roles in achieving artificial general intelligence. David Silver proposed a conjecture: artificial intelligence = reinforcement learning + deep learning (AI = RL + DL). We will see both deep learning and reinforcement learning prospering in the coming years and beyond. Deep learning is exploding. It is the right time to nurture, educate and lead the market for reinforcement learning.

Deep learning, in this third wave of AI, will have deeper influences, as we have already seen from its many achievements. Reinforcement learning, as a more general learning and decision making paradigm, will deeply influence deep learning, machine learning, and artificial intelligence in general.
