Incentivizing exploration in reinforcement learning with deep predictive models

Stadie, Bradly C., Sergey Levine, and Pieter Abbeel. "Incentivizing exploration in reinforcement learning with deep predictive models." arXiv preprint arXiv:1507.00814 (2015).

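The authors model the uncertainty of each (state, action) pair and use it to modify the reward, which helps the agent explore. They say that with their method no random exploration is needed. The approach is fairly general and applies to many RL models, but it requires training an auto-encoder, so it is also somewhat cumbersome.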
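The idea of turning model uncertainty into a reward bonus can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class name `PredictionErrorBonus`, the linear dynamics model, the running-max normalization, the `1 / (t * decay)` schedule, and the parameters `beta`, `decay`, and `lr` are all assumptions of this sketch. The `encode` argument stands in for the trained auto-encoder mentioned above; here any callable that maps a state to a vector will do.

```python
import numpy as np


class PredictionErrorBonus:
    """Hypothetical sketch: reward bonus from a dynamics model's prediction error."""

    def __init__(self, encode, latent_dim, n_actions, beta=1.0, decay=1.0, lr=0.01):
        self.encode = encode          # stand-in for a trained auto-encoder (assumption)
        self.n_actions = n_actions
        self.beta = beta              # weight of the bonus in the modified reward
        self.decay = decay            # constant in the assumed 1 / (t * decay) schedule
        self.lr = lr                  # step size for the online model update
        # Linear model predicting the next encoding from [encoding, one-hot action].
        self.W = np.zeros((latent_dim, latent_dim + n_actions))
        self.max_error = 1e-8         # running maximum, used to normalize errors
        self.t = 0                    # number of transitions seen so far

    def _features(self, z, action):
        one_hot = np.zeros(self.n_actions)
        one_hot[action] = 1.0
        return np.concatenate([z, one_hot])

    def augment(self, state, action, next_state, reward):
        """Return reward + beta * bonus, then update the model on this transition."""
        self.t += 1
        x = self._features(self.encode(state), action)
        residual = self.encode(next_state) - self.W @ x
        error = float(residual @ residual)           # squared prediction error
        self.max_error = max(self.max_error, error)
        bonus = (error / self.max_error) / (self.t * self.decay)
        # One gradient step on the squared error, so familiar transitions
        # become predictable and earn a smaller bonus later.
        self.W += self.lr * np.outer(residual, x)
        return reward + self.beta * bonus


# Usage with an identity "encoder" on 2-dimensional states (illustrative only).
explorer = PredictionErrorBonus(encode=lambda s: np.asarray(s, dtype=float),
                                latent_dim=2, n_actions=2)
r_first = explorer.augment([1.0, 0.0], 0, [0.0, 1.0], reward=0.0)
r_second = explorer.augment([1.0, 0.0], 0, [0.0, 1.0], reward=0.0)
```

Repeating the same transition yields a shrinking bonus, which is the behavior that pushes the agent toward (state, action) pairs the model has not yet learned to predict. The modified reward can then be fed to whatever RL algorithm is in use, which matches the note's remark that the approach is fairly general.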

Practicality: 3 stars

Theory: 1 star

Novelty: 4 stars
