Incentivizing exploration in reinforcement learning with deep predictive models
Stadie, Bradly C., Sergey Levine, and Pieter Abbeel. "Incentivizing exploration in reinforcement learning with deep predictive models." arXiv preprint arXiv:1507.00814 (2015).
作者通过模拟(状态,动作)的不确定性,从而修改reward,帮助agent进行探索。作者说用了他们的方法不用进行随机探索。该方法比较通用,适用于多种RL模型,但是要训练auto-encoder,所以也稍微有点繁琐。
实用指数:3颗星
理论指数:1颗星
创新指数:4颗星
Incentivizing exploration in reinforcement learning with deep predictive models的更多相关文章
- (zhuan) Deep Reinforcement Learning Papers
Deep Reinforcement Learning Papers A list of recent papers regarding deep reinforcement learning. Th ...
- 深度学习国外课程资料(Deep Learning for Self-Driving Cars)+(Deep Reinforcement Learning and Control )
MIT(Deep Learning for Self-Driving Cars) CMU(Deep Reinforcement Learning and Control ) 参考网址: 1 Deep ...
- 18 Issues in Current Deep Reinforcement Learning from ZhiHu
深度强化学习的18个关键问题 from: https://zhuanlan.zhihu.com/p/32153603 85 人赞了该文章 深度强化学习的问题在哪里?未来怎么走?哪些方面可以突破? 这两 ...
- (转) Deep Learning Research Review Week 2: Reinforcement Learning
Deep Learning Research Review Week 2: Reinforcement Learning 转载自: https://adeshpande3.github.io/ad ...
- (转) Deep Reinforcement Learning: Playing a Racing Game
Byte Tank Posts Archive Deep Reinforcement Learning: Playing a Racing Game OCT 6TH, 2016 Agent playi ...
- (转) Deep Learning in a Nutshell: Reinforcement Learning
Deep Learning in a Nutshell: Reinforcement Learning Share: Posted on September 8, 2016by Tim Dettm ...
- (转) Deep Reinforcement Learning: Pong from Pixels
Andrej Karpathy blog About Hacker's guide to Neural Networks Deep Reinforcement Learning: Pong from ...
- 论文笔记之:Asynchronous Methods for Deep Reinforcement Learning
Asynchronous Methods for Deep Reinforcement Learning ICML 2016 深度强化学习最近被人发现貌似不太稳定,有人提出很多改善的方法,这些方法有很 ...
- 论文笔记之:Deep Reinforcement Learning with Double Q-learning
Deep Reinforcement Learning with Double Q-learning Google DeepMind Abstract 主流的 Q-learning 算法过高的估计在特 ...
随机推荐
- OGG_Oracle GoldenGate简介(概念)
2014-03-01 Created By BaoXinjian
- win7 64 python2 xgboost安装
综述: 首先,关于xgboost是啥,可以看这一篇:机器学习(四)--- 从gbdt到xgboost 安装Python3 环境下的xgboost 可以通过pip install , 在网址中下载对应版 ...
- 基于Vuejs实现 Skeleton Loading 骨架图
原文地址:https://cloud.tencent.com/developer/article/1006169 https://mp.weixin.qq.com/s/qmyn6mGrO6hRKuvK ...
- spring中事务配置
1 如果在方法.类.接口上使用注解的方式声明事务,需要在配置文件中进行配置,以便通知 Spring 容器对标注 @Transactional 注解的 bean 加工处理. 首先需要引入 tx 命名空间 ...
- ASP.NET 解决URL中文乱码的解决
暂时先记录一个方法: 在Web.config文件中configuration下的system.web下加入一个配置项:globalization,主要是设置其requestEncoding,貌似中文系 ...
- PHP 如何获取二维数组中某个key的集合(高性能查找)
分享下PHP 获取二维数组中某个key的集合的方法. 具体是这样的,如下一个二维数组,是从库中读取出来的. 代码: $user = array( 0 => array( 'id' => 1 ...
- mongodb学习笔记之索引(转)
一.索引基础: MongoDB的索引几乎与传统的关系型数据库一模一样,这其中也包括一些基本的优化技巧.下面是创建索引的命令: > db.test.ensureIndex({" ...
- RednaxelaFX写的文章/回答的导航帖
https://www.zhihu.com/people/rednaxelafx/answers http://hllvm.group.iteye.com/group/topic/44381#post ...
- Oracle PLSQL Demo - 18.02.管道function[查询零散的字段组成list管道返回] [字段必须对上]
--PACKAGE CREATE OR REPLACE PACKAGE test_141215 is TYPE type_ref IS record( ENAME ), SAL )); TYPE t_ ...
- 使用 Electron 构建桌面应用(拖动控制篇)
使用 Electron 构建桌面应用(拖动控制篇) 当窗口被定义了大小,我们也就是在自定义这个窗口,使得它不可拉伸没有框架,让它看起来就像一个真正的声效器浮在桌面上. 现在问题来了 – 要如何移动或者 ...