[Repost] Deep Reinforcement Learning Based Trading Application at JP Morgan Chase
https://medium.com/@ranko.mosic/reinforcement-learning-based-trading-application-at-jp-morgan-chase-f829b8ec54f2
The FT ran a story today about a new application that will optimize JP Morgan Chase's trade execution (there is a Business Insider article on the same topic for readers without an FT subscription). The intent is to reduce market impact and provide the best execution results for large orders.
It is a complex application with many moving parts:

Its core is an RL algorithm that learns to perform the best action (choose the optimal price, duration, and order size) based on market conditions. It is not clear whether it is Sarsa (on-policy TD control) or Q-learning (off-policy TD control), as both algorithms appear in JP Morgan's slides:

[Slide: Sarsa update rule]

[Slide: Q-learning update rule]
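For reference, the standard one-step updates of the two algorithms (as given in Sutton and Barto) are:

$$Q(S_t, A_t) \leftarrow Q(S_t, A_t) + \alpha \left[ R_{t+1} + \gamma\, Q(S_{t+1}, A_{t+1}) - Q(S_t, A_t) \right] \quad \text{(Sarsa)}$$

$$Q(S_t, A_t) \leftarrow Q(S_t, A_t) + \alpha \left[ R_{t+1} + \gamma \max_a Q(S_{t+1}, a) - Q(S_t, A_t) \right] \quad \text{(Q-learning)}$$

The only difference is the bootstrap target: Sarsa evaluates the action the current policy actually takes next (on-policy), while Q-learning evaluates the greedy action (off-policy).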
The state consists of the price series, expected spread cost, fill probability, size placed, as well as elapsed time, percentage progress, etc. Rewards are immediate rewards (price spread) and terminal (end-of-episode) rewards such as completion, order duration, and market penalties (the penalties are negative rewards that punish the agent along these dimensions).
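To make that state and reward structure concrete, here is a minimal sketch in Python. All field names, signatures, and constants are hypothetical assumptions for illustration, not JP Morgan's actual schema.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ExecutionState:
    price_series: List[float]    # recent market prices
    expected_spread_cost: float  # estimated cost of crossing the spread
    fill_probability: float      # estimated chance a resting order fills
    size_placed: int             # child-order size already placed
    elapsed_time: float          # time since the parent order started
    progress: float              # fraction of the parent order completed

def immediate_reward(fill_price: float, mid_price: float, side: str) -> float:
    """Per-step reward: spread captured relative to the mid price."""
    return mid_price - fill_price if side == "buy" else fill_price - mid_price

def terminal_reward(completed: bool, duration: float,
                    market_penalty: float) -> float:
    """End-of-episode reward: completion bonus minus duration and
    market-impact penalties (the penalties enter as negative rewards)."""
    completion_bonus = 1.0 if completed else 0.0
    return completion_bonus - 0.001 * duration - market_penalty
```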

What the agent learns is memorized as the weights of a deep neural network; function approximation via a NN is used because the state-action space is too big to be handled in tabular form. We assume stochastic gradient descent is used to train the network through its forward and backpropagation passes (hence the "Deep" designation).
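As an illustration of what "Q-learning with a neural-network function approximator trained by SGD" could look like, here is a minimal PyTorch sketch. The network sizes, action space, and hyperparameters are assumptions, not the actual LOXM design.

```python
import torch
import torch.nn as nn

STATE_DIM = 8   # e.g. price features, spread cost, fill probability, %progress
N_ACTIONS = 5   # e.g. a discretized grid of (price level, child-order size)
GAMMA = 0.99    # discount factor

# Q-network: maps a state vector to one Q-value per action.
q_net = nn.Sequential(
    nn.Linear(STATE_DIM, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, N_ACTIONS),
)
optimizer = torch.optim.SGD(q_net.parameters(), lr=1e-3)

def td_update(state, action, reward, next_state, done):
    """One stochastic-gradient step on the squared TD error (Q-learning target)."""
    q_pred = q_net(state)[action]  # Q(s, a)
    with torch.no_grad():
        # Q-learning bootstraps from the greedy next action; zero at episode end.
        q_next = torch.zeros(()) if done else q_net(next_state).max()
        target = reward + GAMMA * q_next
    loss = (q_pred - target) ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Example transition with random (placeholder) state vectors:
s, s_next = torch.randn(STATE_DIM), torch.randn(STATE_DIM)
td_update(s, action=2, reward=-0.01, next_state=s_next, done=False)
```

A production system would add experience replay and a target network (as in DQN); the point here is only the TD update applied to the network weights.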

JP Morgan is convinced this is the very first real-time trading AI/ML application on Wall Street. We assume this is not true, i.e., there are surely other players operating in this space, as applying RL to order execution has been known for quite a while now (Kearns and Nevmyvaka 2006).
The latest LOXM developments will be presented at the QuantMinds conference in Lisbon in May 2018.
Instinet is also using Q-learning, probably for the same purpose (market impact reduction).