Multi-shot Pedestrian Re-identification via Sequential Decision Making

2019-07-31 20:33:37

Paper: http://openaccess.thecvf.com/content_cvpr_2018/papers/Zhang_Multi-Shot_Pedestrian_Re-Identification_CVPR_2018_paper.pdf

Code: https://github.com/TuSimple/rl-multishot-reid

1. Background and Motivation:

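This paper brings deep reinforcement learning (DRL) to the person re-ID task, using sequential decision making to handle both easy and hard samples. The main motivation is illustrated in the accompanying figure.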

2. The Proposed Method:

2.1 Image-level feature extraction:

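For image-level feature extraction, the authors train with a combination of loss functions: classification loss, pairwise verification loss, and triplet verification loss. Two classic backbone networks are used: Inception-BN and AlexNet. The features of all images in a sequence are then aggregated into a single l2-normalized feature.

Identities are then ranked according to l (*, *), computed on these aggregated features.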
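The aggregation and ranking steps above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes mean pooling as the aggregation and Euclidean distance as l (*, *), and the function names `aggregate_sequence` and `rank_gallery` are hypothetical.

```python
import numpy as np

def aggregate_sequence(frame_features):
    """Average per-image features (assumed pooling) and l2-normalize the result."""
    pooled = np.asarray(frame_features, dtype=np.float64).mean(axis=0)
    return pooled / (np.linalg.norm(pooled) + 1e-12)

def rank_gallery(probe_feature, gallery_features):
    """Return gallery indices sorted by increasing Euclidean distance (assumed l)."""
    gallery = np.asarray(gallery_features, dtype=np.float64)
    distances = np.linalg.norm(gallery - probe_feature, axis=1)
    return np.argsort(distances)

# Toy usage: two frames of one probe sequence, two gallery identities.
probe = aggregate_sequence([[1.0, 0.0], [1.0, 0.0]])
ranking = rank_gallery(probe, [[0.0, 1.0], [1.0, 0.0]])
```

In this toy case the second gallery entry coincides with the probe feature, so it is ranked first.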

2.2 Sequence Level Feature Aggregation:

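The authors cast the problem as a Markov Decision Process (MDP), denoted (S, A, T, R). At each step, the agent receives an image pair selected from the two input sequences, observes the state, and chooses an action, after which it receives a reward r. If the sequence has not ended, the agent then receives the next image pair and moves to a new state.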

Actions and Transitions:

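First, one image is drawn at random from each of the two sequences to form an image pair. The pair is fed to the agent, which chooses one of three actions: same, different, or unsure. The first two actions terminate the current episode and output the result. The authors argue that once the agent has gathered enough information and is confident enough to decide, it should stop early to avoid unnecessary computation. If the agent chooses unsure, another image pair is selected for it to judge.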
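The episode described above can be sketched as a short loop. This is only an illustration under stated assumptions: `policy` stands in for the trained agent, `t_max` caps the number of pairs, and the name `run_episode` is hypothetical.

```python
import random

SAME, DIFFERENT, UNSURE = "same", "different", "unsure"

def run_episode(seq_a, seq_b, policy, t_max, rng=None):
    """Draw random image pairs until the policy commits or t_max pairs are used.

    Returns the final action and the number of pairs that were examined.
    """
    rng = rng or random.Random(0)
    for t in range(1, t_max + 1):
        pair = (rng.choice(seq_a), rng.choice(seq_b))
        action = policy(pair, t)
        if action != UNSURE:
            return action, t  # "same" or "different" ends the episode early
    return UNSURE, t_max

# Toy policy (an assumption): stay unsure for two pairs, then answer "same".
def toy_policy(pair, t):
    return UNSURE if t < 3 else SAME

result = run_episode(["a1", "a2"], ["b1", "b2"], toy_policy, t_max=5)
```

The early return is what saves computation: an easy pair of sequences can be resolved after a single image pair, while harder ones consume more.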

Rewards: the reward is defined as follows:

+1 if the agent's answer matches the ground truth (gt);

-1 if $a_t$ differs from gt, or if $a_t$ is still unsure when t = $t_{max}$;

$r_p$ if t < $t_{max}$ and $a_t$ is unsure.

The value of $r_p$ may be positive or negative. If $r_p$ is negative, the agent is penalized for requesting more pairs; if $r_p$ is positive, the agent is encouraged to gather more pairs and to stop once it has collected $t_{max}$ pairs, avoiding the penalty of -1. This value strongly affects the agent's final behavior.
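The reward rules above translate directly into a small function. The sketch below is an illustration rather than the authors' implementation; the action encoding and the name `reward` are assumptions.

```python
SAME, DIFFERENT, UNSURE = "same", "different", "unsure"

def reward(action, is_same_person, t, t_max, r_p):
    """Reward for taking `action` at step t, given the ground-truth label."""
    if action == UNSURE:
        # Asking for another pair earns r_p, unless the budget is exhausted.
        return r_p if t < t_max else -1.0
    predicted_same = (action == SAME)
    return 1.0 if predicted_same == is_same_person else -1.0

# Toy values (assumptions): t_max = 4 and a small positive r_p.
r_correct = reward(SAME, True, t=1, t_max=4, r_p=0.2)
r_wrong = reward(DIFFERENT, True, t=1, t_max=4, r_p=0.2)
r_wait = reward(UNSURE, True, t=2, t_max=4, r_p=0.2)
r_timeout = reward(UNSURE, True, t=4, t_max=4, r_p=0.2)
```

With a positive $r_p$ the agent can collect reward by waiting, but only up to the step before $t_{max}$; with a negative $r_p$ every extra pair costs something.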

States and Deep Q-learning:

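Deep Q-learning is used to find the optimal policy. For each state-action pair $(s_t, a_t)$, $Q(s_t, a_t)$ denotes the discounted cumulative reward obtained from that state and action. During training, the Q-function is updated iteratively.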
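For reference, the standard Q-learning target that such an iterative update regresses toward is $r_t + \gamma \max_{a'} Q(s_{t+1}, a')$, or just $r_t$ when the episode terminates. The sketch below computes this generic target; the discount `gamma=0.9` and the function name `td_target` are assumptions, and the paper's exact update may differ in detail.

```python
import numpy as np

def td_target(reward, next_q_values, terminal, gamma=0.9):
    """Standard one-step Q-learning target for a single transition."""
    if terminal:
        # "same"/"different" end the episode, so no future value is added.
        return float(reward)
    return float(reward + gamma * np.max(next_q_values))

# Toy usage: an "unsure" step with r_p = 0.2 and Q-values for the next state.
target_continue = td_target(0.2, [0.5, -0.3, 0.1], terminal=False)
target_stop = td_target(1.0, [0.0, 0.0, 0.0], terminal=True)
```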

At time t, the state $s_t$ consists of three parts:

1). the first part is the observation $o_t$, i.e. the image features;

2). the second part is a weighted average of the differences between the historical image features of the two sequences, with the weights computed as defined in the paper;

3). the image features are also augmented with hand-crafted features for better discrimination.
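The three parts of the state can be assembled roughly as below. This is a sketch under assumptions: the per-step weights are passed in by the caller because their formula is not reproduced in these notes, the observation is assumed to have the same dimensionality as the feature differences, and `build_state` is a hypothetical name.

```python
import numpy as np

def build_state(observation, past_differences, past_weights, handcrafted):
    """Concatenate o_t, the weighted history of feature differences, and extras."""
    obs = np.asarray(observation, dtype=np.float64)
    if len(past_differences) == 0:
        # No history at the first step: use zeros of the same size (assumption).
        history = np.zeros_like(obs)
    else:
        diffs = np.asarray(past_differences, dtype=np.float64)
        weights = np.asarray(past_weights, dtype=np.float64)
        history = (weights[:, None] * diffs).sum(axis=0) / (weights.sum() + 1e-12)
    extras = np.asarray(handcrafted, dtype=np.float64)
    return np.concatenate([obs, history, extras])

# Toy usage: 2-d observation, two earlier differences with equal weights.
state = build_state([1.0, 2.0], [[1.0, 1.0], [3.0, 3.0]], [1.0, 1.0], [0.5])
first_state = build_state([1.0, 2.0], [], [], [0.5])
```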

3. Experimental Results:
