Deep Reinforcement Learning
Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning)
http://incompleteideas.net/book/bookdraft2017nov5.pdf
http://incompleteideas.net/book/ebook/the-book.html
https://www.amazon.com/Reinforcement-Learning-Introduction-Adaptive-Computation/dp/0262193981
https://orbi.ulg.ac.be/bitstream/2268/27963/1/book-FA-RL-DP.pdf
http://videolectures.net/deeplearning2017_montreal/
Reinforcement Learning--David Silver
http://www0.cs.ucl.ac.uk/staff/D.Silver/web/Teaching.html
https://www.youtube.com/watch?v=2pWv7GOvuf0
COMBINING POLICY GRADIENT AND Q-LEARNING
https://arxiv.org/pdf/1611.01626.pdf
https://www.quora.com/Whats-the-difference-between-reinforcement-Learning-and-Deep-learning
https://www.oreilly.com/ideas/reinforcement-learning-for-complex-goals-using-tensorflow
Frontier: A major overhaul of deep learning training methods; backpropagation is no longer the only option
https://zhuanlan.zhihu.com/p/22143664
Frontier: Let Computers Learn to Learn
https://zhuanlan.zhihu.com/p/21362413?refer=intelligentunit
Deep Reinforcement Learning: Policy Gradient Methods, Part 1
https://zhuanlan.zhihu.com/p/21725498
https://deepmind.com/blog/#decoupled-neural-interfaces-using-synthetic-gradients
More from my Simple Reinforcement Learning with Tensorflow series:
- Part 0 — Q-Learning Agents
- Part 1 — Two-Armed Bandit
- Part 1.5 — Contextual Bandits
- Part 2 — Policy-Based Agents
- Part 3 — Model-Based RL
- Part 4 — Deep Q-Networks and Beyond
- Part 5 — Visualizing an Agent’s Thoughts and Actions
- Part 6 — Partial Observability and Deep Recurrent Q-Networks
- Part 7 — Action-Selection Strategies for Exploration
- Part 8 — Asynchronous Actor-Critic Agents (A3C)
https://keon.io/deep-q-learning/
Human-level control through deep reinforcement learning
https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf
http://rll.berkeley.edu/deeprlcourse/
https://cs.stanford.edu/people/karpathy/convnetjs/demo/rldemo.html
How can the Q-learning procedure be explained with a simple example?
https://www.zhihu.com/question/26408259
https://deeplearning4j.org/reinforcementlearning.html
https://deeplearning4j.org/neuralnet-overview.html
https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-reinforcement-learning/
https://medium.com/ai-society/my-first-experience-with-deep-reinforcement-learning-1743594f0361
http://neuro.cs.ut.ee/demystifying-deep-reinforcement-learning/
Deep Reinforcement Learning resources (continuously updated)
https://zhuanlan.zhihu.com/p/20885568
An in-depth look at AlphaGo
https://zhuanlan.zhihu.com/p/20893777
Deep Learning Papers Reading Roadmap
https://zhuanlan.zhihu.com/p/23080129
ICLR 2017 papers related to DRL
https://zhuanlan.zhihu.com/p/23807875
https://www.intelnervana.com/demystifying-deep-reinforcement-learning/
http://www.jmlr.org/papers/volume6/murphy05a/murphy05a.pdf
https://deepmind.com/research/publications/
https://deepmind.com/blog/alphago-zero-learning-scratch/
Mastering the Game of Go without Human Knowledge
https://www.nature.com/articles/doi:10.1038/nature24270
https://en.wikipedia.org/wiki/State%E2%80%93action%E2%80%93reward%E2%80%93state%E2%80%93action
DQN from Getting Started to Giving Up, Part 1: DQN and Reinforcement Learning
https://zhuanlan.zhihu.com/p/21262246?refer=intelligentunit
DQN from Getting Started to Giving Up, Part 4: Dynamic Programming and Q-Learning
https://zhuanlan.zhihu.com/p/21378532?refer=intelligentunit
DQN from Getting Started to Giving Up, Part 5: An In-Depth Look at the DQN Algorithm
https://zhuanlan.zhihu.com/p/21421729
Reinforcement Learning Series, Part 9: Deep Q Network (DQN)
http://www.algorithmdog.com/drl