Deep Reinforcement Learning Based Trading Application at JP Morgan Chase

https://medium.com/@ranko.mosic/reinforcement-learning-based-trading-application-at-jp-morgan-chase-f829b8ec54f2

The FT published a story today about a new application that will optimize trade execution at JP Morgan Chase (a Business Insider article covers the same topic for readers who do not have an FT subscription). The intent is to reduce market impact and achieve the best execution results for large orders.

It is a complex application with many moving parts.

 

Its core is an RL algorithm that learns to take the best action (choosing the optimal price, duration and order size) based on market conditions. It is not clear whether the algorithm is Sarsa (on-policy TD control) or Q-learning (off-policy TD control), as both appear in the JP Morgan slides:

 

Sarsa

 

Q-learning
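
For readers without the slides at hand, the difference between the two candidates can be sketched with the standard textbook tabular updates. This is a minimal illustration, not JP Morgan's implementation: the toy Q-table, the step size `alpha` and the discount `gamma` are assumptions chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
    """On-policy TD control: bootstrap from the action actually taken next."""
    td_target = r + gamma * Q[s_next, a_next]
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """Off-policy TD control: bootstrap from the greedy action in the next state."""
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q

# Hypothetical toy table: 2 states x 2 actions.
Q_sarsa = np.array([[0.0, 0.0], [1.0, 2.0]])
Q_qlearn = Q_sarsa.copy()

# Same transition for both; Sarsa follows the (non-greedy) next action 0.
sarsa_update(Q_sarsa, s=0, a=0, r=1.0, s_next=1, a_next=0, alpha=0.5, gamma=1.0)
q_learning_update(Q_qlearn, s=0, a=0, r=1.0, s_next=1, alpha=0.5, gamma=1.0)

print(Q_sarsa[0, 0])   # 0.5 * (1 + 1.0) = 1.0
print(Q_qlearn[0, 0])  # 0.5 * (1 + 2.0) = 1.5
```

The only difference is the bootstrap term: Sarsa uses the value of the action the current policy actually picks next, while Q-learning uses the maximum over next actions regardless of what the policy does.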

The state consists of the price series, expected spread cost, fill probability and size placed, as well as elapsed time, % progress, etc. Rewards are either immediate (price spread) or terminal (end of episode), such as completion, order duration and market penalties; the terminal ones are negative rewards that punish the agent along these dimensions.
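
To make the state and reward description concrete, here is a rough sketch of how they might be laid out. The field names follow the list above, but the class `ExecutionState`, the function names and all penalty weights are hypothetical; the slides do not say how these quantities are scaled or combined.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ExecutionState:
    """Hypothetical container for the state features named in the article."""
    prices: List[float]          # recent price series
    expected_spread_cost: float
    fill_probability: float
    size_placed: float
    elapsed_time: float
    pct_progress: float          # fraction of the parent order already filled

    def to_vector(self) -> List[float]:
        # Flatten into a single feature vector for a function approximator.
        return [*self.prices, self.expected_spread_cost, self.fill_probability,
                self.size_placed, self.elapsed_time, self.pct_progress]

def immediate_reward(price_spread: float) -> float:
    """Per-step reward: the price spread obtained on this step."""
    return price_spread

def terminal_reward(pct_completed: float, duration: float, market_penalty: float,
                    w_incomplete: float = 1.0, w_duration: float = 0.01,
                    w_market: float = 1.0) -> float:
    """End-of-episode reward: a non-positive sum of penalties (weights are assumed)."""
    incomplete = 1.0 - pct_completed
    return -(w_incomplete * incomplete + w_duration * duration
             + w_market * market_penalty)

state = ExecutionState(prices=[100.0, 100.5], expected_spread_cost=0.02,
                       fill_probability=0.7, size_placed=500.0,
                       elapsed_time=30.0, pct_progress=0.25)
features = state.to_vector()
penalty = terminal_reward(pct_completed=0.5, duration=100.0, market_penalty=0.2)
```

A fully completed order with zero duration and no market penalty scores zero at the end of the episode in this sketch; anything else is negative, which matches the idea that terminal rewards only punish.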

 

The learned action values are stored as the weights of a deep neural network; function approximation via an NN is used because the state-action space is too big to be handled in tabular form. We assume the network is trained with stochastic gradient descent, with a feed-forward pass followed by backpropagation (hence the "Deep" designation).
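
As an illustration of what replacing the table with a network means, the sketch below trains a tiny one-hidden-layer Q-network with plain SGD on a single transition. Everything here is an assumption made for the example: the class `QNetwork`, the layer sizes, the learning rate and the use of a Q-learning target are not taken from the JP Morgan slides.

```python
import numpy as np

class QNetwork:
    """Tiny MLP mapping a state vector to one Q-value per action (illustrative)."""

    def __init__(self, state_dim, n_actions, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (state_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.1, (hidden, n_actions))
        self.b2 = np.zeros(n_actions)

    def forward(self, s):
        # Feed-forward pass: tanh hidden layer, linear output layer.
        h = np.tanh(s @ self.W1 + self.b1)
        return h @ self.W2 + self.b2, h

    def sgd_step(self, s, a, target, lr=0.05):
        """One SGD step on 0.5 * (Q(s, a) - target)^2, target held fixed."""
        q, h = self.forward(s)
        err = q[a] - target
        # Backpropagation: only output unit `a` receives a gradient.
        dh = err * self.W2[:, a]        # uses W2 before it is updated
        dz = dh * (1.0 - h ** 2)        # derivative of tanh
        self.W2[:, a] -= lr * err * h
        self.b2[a] -= lr * err
        self.W1 -= lr * np.outer(s, dz)
        self.b1 -= lr * dz
        return 0.5 * err ** 2

net = QNetwork(state_dim=3, n_actions=2)
s = np.array([0.1, -0.2, 0.3])
s_next = np.array([0.0, 0.1, 0.2])
r, gamma = 1.0, 0.9

# Q-learning style target, computed once and treated as a constant.
target = r + gamma * np.max(net.forward(s_next)[0])
losses = [net.sgd_step(s, a=0, target=target) for _ in range(50)]
```

The weights now play the role of the Q-table: instead of one cell per state-action pair, nearby states share parameters, which is what makes a large state space tractable.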

 

JP Morgan is convinced this is the very first real-time trading AI/ML application on Wall Street. We assume this is not true, i.e. there are surely other players operating in this space, since applying RL to order execution has been known for quite a while now (Kearns and Nevmyvaka 2006).

The latest LOXM developments will be presented at the QuantMinds Conference in Lisbon (May 2018).

Instinet is also using Q-learning, probably for the same purpose (market impact reduction).
