https://www.cs.cmu.edu/afs/cs/project/jair/pub/volume4/kaelbling96a-html/node26.html

【平均-打折奖励】

Schwartz [106] examined the problem of adapting Q-learning to an average-reward framework. Although his R-learning algorithm seems to exhibit convergence problems for some MDPs, several researchers have found the average-reward criterion closer to the true problem they wish to solve than a discounted criterion and therefore prefer R-learning to Q-learning [69].

To discount or not to discount in reinforcement learning: A case study comparing R learning and Q learning的更多相关文章

  1. (转) Deep Learning Research Review Week 2: Reinforcement Learning

      Deep Learning Research Review Week 2: Reinforcement Learning 转载自: https://adeshpande3.github.io/ad ...

  2. (转) Using the latest advancements in AI to predict stock market movements

    Using the latest advancements in AI to predict stock market movements 2019-01-13 21:31:18 This blog ...

  3. 强化学习(Reinfment Learning) 简介

    本文内容来自以下两个链接: https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/ https: ...

  4. (zhuan) Deep Reinforcement Learning Papers

    Deep Reinforcement Learning Papers A list of recent papers regarding deep reinforcement learning. Th ...

  5. (转) Deep Learning in a Nutshell: Reinforcement Learning

    Deep Learning in a Nutshell: Reinforcement Learning   Share: Posted on September 8, 2016by Tim Dettm ...

  6. (转) Deep Reinforcement Learning: Pong from Pixels

    Andrej Karpathy blog About Hacker's guide to Neural Networks Deep Reinforcement Learning: Pong from ...

  7. Deep Reinforcement Learning: Pong from Pixels

    这是一篇迟来很久的关于增强学习(Reinforcement Learning, RL)博文.增强学习最近非常火!你一定有所了解,现在的计算机能不但能够被全自动地训练去玩儿ATARI(译注:一种游戏机) ...

  8. [DQN] What is Deep Reinforcement Learning

    已经成为DL中专门的一派,高大上的样子 Intro: MIT 6.S191 Lecture 6: Deep Reinforcement Learning Course: CS 294: Deep Re ...

  9. Deep Reinforcement Learning 基础知识

    Introduction 深度增强学习Deep Reinforcement Learning是将深度学习与增强学习结合起来从而实现从Perception感知到Action动作的端对端学习的一种全新的算 ...

随机推荐

  1. 51Nod 1019 逆序数(线段树)

    题目链接:逆序数 模板题. #include <bits/stdc++.h> using namespace std; #define rep(i, a, b) for (int i(a) ...

  2. POJ 3249 Test for Job (dfs + dp)

    题目链接:http://poj.org/problem?id=3249 题意: 给你一个DAG图,问你入度为0的点到出度为0的点的最长路是多少 思路: 记忆化搜索,注意v[i]可以是负的,所以初始值要 ...

  3. 多字节与UTF-8、Unicode之间的转换

    from http://blog.csdn.net/frankiewang008/article/details/12832239 // 多字节编码转为UTF8编码 bool MBToUTF8(vec ...

  4. 怎样去主动拿一个锁并占有?synchronized关键字即可

    怎样主动去拿一个?synchronized关键字即可 怎样去释放一个锁呢?要求锁对象主动释放,打乱占有当前锁的线程即可

  5. NetBeans菜单栏字体太小了

    NetBeans菜单栏字体太小了,导致很难看 解决方法:在netbeans的快捷方式内加入"netbeans.exe" --fontsize 12参数.还可以通过配置NetBean ...

  6. pandas常用操作

    删除某列: concatdfs.drop('Unnamed: 0',axis=1) 打印所有列名: .columns

  7. ylbtech-czgfh(规范化)-数据库设计

    ylbtech-DatabaseDesgin:ylbtech-czgfh(规范化)-数据库设计 DatabaseName:czgfh(财政规范化) Model:账户模块.系统时间设计模块.上报自评和审 ...

  8. Windows网络编程 2 【转】

    Windows网络编程使用winsock.Winsock是一个基于Socket模型的API,在Windows系统中广泛使用.使用Winsock进行网络编程需要包含头文件Winsock2.h,需要使用库 ...

  9. Linux学习之十三-vi和vim编辑器及其快捷键

    vi和vim编辑器及其快捷键 1.vi与vim区别 它们都是多模式编辑器,不同的是vim 是vi的升级版本,它不仅兼容vi的所有指令,而且还有一些新的特性在里面. vim的这些优势主要体现在以下几个方 ...

  10. C 语言实例

    C 语言实例 C 语言实例 - 输出 "Hello, World!" C 语言实例 - 输出整数 C 语言实例 - 两个数字相加 C 语言实例 - 两个浮点数相乘 C 语言实例 - ...