Stadie, Bradly C., Sergey Levine, and Pieter Abbeel. "Incentivizing exploration in reinforcement learning with deep predictive models." arXiv preprint arXiv:1507.00814 (2015).

作者通过模拟(状态,动作)的不确定性,从而修改reward,帮助agent进行探索。作者说用了他们的方法不用进行随机探索。该方法比较通用,适用于多种RL模型,但是要训练auto-encoder,所以也稍微有点繁琐。

实用指数:3颗星

理论指数:1颗星

创新指数:4颗星

Incentivizing exploration in reinforcement learning with deep predictive models的更多相关文章

  1. (zhuan) Deep Reinforcement Learning Papers

    Deep Reinforcement Learning Papers A list of recent papers regarding deep reinforcement learning. Th ...

  2. 深度学习国外课程资料(Deep Learning for Self-Driving Cars)+(Deep Reinforcement Learning and Control )

    MIT(Deep Learning for Self-Driving Cars) CMU(Deep Reinforcement Learning and Control ) 参考网址: 1 Deep ...

  3. 18 Issues in Current Deep Reinforcement Learning from ZhiHu

    深度强化学习的18个关键问题 from: https://zhuanlan.zhihu.com/p/32153603 85 人赞了该文章 深度强化学习的问题在哪里?未来怎么走?哪些方面可以突破? 这两 ...

  4. (转) Deep Learning Research Review Week 2: Reinforcement Learning

      Deep Learning Research Review Week 2: Reinforcement Learning 转载自: https://adeshpande3.github.io/ad ...

  5. (转) Deep Reinforcement Learning: Playing a Racing Game

    Byte Tank Posts Archive Deep Reinforcement Learning: Playing a Racing Game OCT 6TH, 2016 Agent playi ...

  6. (转) Deep Learning in a Nutshell: Reinforcement Learning

    Deep Learning in a Nutshell: Reinforcement Learning   Share: Posted on September 8, 2016by Tim Dettm ...

  7. (转) Deep Reinforcement Learning: Pong from Pixels

    Andrej Karpathy blog About Hacker's guide to Neural Networks Deep Reinforcement Learning: Pong from ...

  8. 论文笔记之:Asynchronous Methods for Deep Reinforcement Learning

    Asynchronous Methods for Deep Reinforcement Learning ICML 2016 深度强化学习最近被人发现貌似不太稳定,有人提出很多改善的方法,这些方法有很 ...

  9. 论文笔记之:Deep Reinforcement Learning with Double Q-learning

    Deep Reinforcement Learning with Double Q-learning Google DeepMind Abstract 主流的 Q-learning 算法过高的估计在特 ...

随机推荐

  1. Unix环境高级编程(三)标准I/O库

    标准I/O库是ISO C的标准,在很多操作系统上面都实现.Unix文件I/O函数都是针对文件描述符的,当打开一个文件的时候,返回该文件描述符用于后续的I/O操作.而对于标准I/O库,操作则是围绕流进行 ...

  2. 转 docker 部署 kafka

    原文链接 http://blog.csdn.net/snowcity1231/article/details/54946857 -e KAFKA_BROKER_ID=1 -e ZK=zk -p 909 ...

  3. 页面返回刷新或H5监听(手机的)返回键

    1. pushHistory(); window.addEventListener("popstate", function(e) { alert("我监听到了浏览器的返 ...

  4. 【iOS】自动引用计数 (循环引用)

    历史版本 ARC(Automatic Reference Counting,自动引用计数)极大地减少了Cocoa开发中的常见编程错误:retain跟release不匹配.ARC并不会消除对retain ...

  5. SpringCloud分布式开发五大神兽

    SpringCloud分布式开发五大神兽 服务发现——Netflix Eureka 客服端负载均衡——Netflix Ribbon 断路器——Netflix Hystrix 服务网关——Netflix ...

  6. Let's call it a "return"

    Preface Long time no see. For some reason that I failed to keep updating this blog, which is really ...

  7. 关于navigationBar的颜色计算与默认透明度

    NavigationBar的默认透明度为85% 颜色叠加计算公式: RGB需要分开计算,以叠加白色背景为例: 原始颜色(217,10,20) 设置透明度85% 计算过程: R=217 + (255-2 ...

  8. 关于NavigationItem.rightBarButtonItem设置

    转自:http://blog.csdn.net/zhuzhihai1988/article/details/7701998 第一种: UIImage *searchimage=[UIImage ima ...

  9. Spring Boot干货系列:(一)优雅的入门篇

    Spring Boot干货系列:(一)优雅的入门篇 2017-02-26 嘟嘟MD 嘟爷java超神学堂   前言 Spring一直是很火的一个开源框架,在过去的一段时间里,Spring Boot在社 ...

  10. HttpWebRequest类与HttpRequest类的区别

    HttpRequest类的对象用于服务器端,获取客户端传来的请求的信息,包括HTTP报文传送过来的所有信息.而HttpWebRequest用于客户端,拼接请求的HTTP报文并发送等. HttpWebR ...