Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

2019-07-15 22:23:02

Paper: https://arxiv.org/pdf/1801.01290.pdf or Updated Version: https://arxiv.org/pdf/1812.05905.pdf

Project: https://sites.google.com/view/soft-actor-critic or https://sites.google.com/view/sac-and-applications/

TensorFlow: https://github.com/haarnoja/sac

PyTorch: https://github.com/vitchyr/rlkit

Demo video: https://www.youtube.com/channel/UCxXt8Br3-wyluz9Q08-fsaA

Good Related Blog: https://zhuanlan.zhihu.com/p/70360272

==== Related Video Tutorials (A2C, A3C):

A brief review of Actor-Critic Algorithms: https://www.youtube.com/watch?v=aODdNpihRwM

CS885 Lecture 7b: Actor Critic: https://www.youtube.com/watch?v=5Ke-d1Itk3k

DRL Lecture 6: Actor-Critic: https://www.youtube.com/watch?v=j82QLgfhFiY&t=27s

Build an A2C agent that learns to play Sonic with TensorFlow (tutorial): https://www.youtube.com/watch?v=GCfUdkCL7FQ

Reinforcement Learning 6: Policy Gradients and Actor Critics (DeepMind): https://www.youtube.com/watch?v=bRfUxQs6xIM&t=27s

Actor Critic (A3C) Tutorial: https://www.youtube.com/watch?v=O5BlozCJBSE

Actor Critic Algorithms: https://www.youtube.com/watch?v=w_3mmm0P0j8&t=2s

