Paper Reading: Deep Attentive Tracking via Reciprocative Learning

Deep Attentive Tracking via Reciprocative Learning

2018-11-14 13:30:36

Paper: https://arxiv.org/abs/1810.03851

Project page: https://ybsong00.github.io/nips18_tracking/index

Code: https://github.com/shipubupt/NIPS2018

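Yes, like a lot of people, I was thrown by the "Reciprocative Learning" in the title. I had heard of Meta-learning, Reinforcement Learning, and plenty of other learning methods, but this was the first time I had come across "Reciprocative" (T_T). Fine, enough chatter, on to the paper.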

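Building on the tracking-by-detection framework, this paper proposes a novel attention mechanism to assist tracking, so that the tracker automatically attends to target object regions while tracking. The network structure also looks fairly simple; see the overview figure in the paper.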

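The paper uses attention maps as "regularization terms" that help the classifier focus more on the target object, which makes it more robust to appearance changes. At test time, the authors locate the target object directly from the classification score.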

1. Attention Exploiting:

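We first show how visual attention is incorporated into the tracking-by-detection framework. Denote the input as $I$; the network outputs a vector of scores, where each element indicates how likely $I$ is to belong to a predefined class $c$. Given a specific sample $I_0$, the score function $f_c(I)$ is approximated by a first-order Taylor expansion at a point $z_0$ (Eq. 1).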

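Here, the point $z_0$ lies in a deleted $\epsilon$-neighborhood of $I_0$. The approximation in Eq. 1 holds for any point in the neighborhood of $I_0$. The derivatives of $f_c(I)$ at $z_0$ and at $I_0$ are therefore equal, because the two points are infinitely close. In Eq. 1, the derivative with respect to the input $I$, evaluated at the sample $I_0$, is $A_c$ (Eq. 2).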
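For reference, a form of Eqs. 1 and 2 that is consistent with the description above is sketched here. It is a reconstruction under the stated first-order assumption, with $B$ standing for the constant term of the expansion (a symbol not defined in these notes), and should be checked against the paper:

$$f_c(I) \approx A_c^\top I + B \tag{1}$$

$$A_c = \left.\frac{\partial f_c(I)}{\partial I}\right|_{I = I_0} \tag{2}$$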

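Eq. 1 shows that the output score of class $c$ is affected by the element values of $A_c$. In other words, the values of $A_c$ indicate how important the corresponding pixels of $I_0$ are in producing the class score, so $A_c$ can be regarded as an attention map. For another input image $I_1$, the Taylor expansion is applied again at a point $z_1$ to approximate $f_c(I)$, where $z_1$ lies in a deleted neighborhood of $I_1$. The approximation holds for all points in the neighborhood of $I_1$, so the attention map $A_c$ is specific to each image sample.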

Following Eq. 2, we compute the partial derivative of the network output $f_c(I)$ with respect to the input $I$. This is done in two steps:

Step 1: feed the input sample $I_0$ into the network to obtain the predicted score $f_c(I_0)$.

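Step 2: take the partial derivative of $f_c(I)$ with respect to $I$. By the chain rule, this derivative can be computed through back-propagation. The output of the first layer during back-propagation is taken as the attention map $A_c$.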

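Only the gradients with positive values are kept, because they make a clear contribution to the final classification. The attention map $A_c$ therefore contains no negative values, and it reflects how the network attends to the input sample $I_0$. Note that the network parameters are held fixed and are not updated during this back-propagation pass.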
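The two steps can be illustrated with a minimal sketch on a toy scorer. The two-layer network `f(x) = w2 . relu(W1 x)`, the function names, and the numbers are all assumptions made for illustration; the paper's tracker uses a CNN and an autograd framework rather than a hand-written backward pass.

```python
def forward(x, W1, w2):
    """Toy scorer f(x) = w2 . relu(W1 x); returns the score and the hidden pre-activations."""
    pre = [sum(w * xi for w, xi in zip(row, x)) for row in W1]
    hidden = [max(0.0, p) for p in pre]
    score = sum(w * h for w, h in zip(w2, hidden))
    return score, pre


def attention_map(x, W1, w2):
    """Gradient of the score w.r.t. the input, keeping only positive entries.

    The weights are read but never modified, mirroring the fixed-parameter
    backward pass described in the text.
    """
    _, pre = forward(x, W1, w2)                      # step 1: forward pass
    # step 2: back-propagate; a ReLU passes gradient only where it was active
    grad_hidden = [w if p > 0 else 0.0 for w, p in zip(w2, pre)]
    grad_input = [
        sum(g * row[j] for g, row in zip(grad_hidden, W1))
        for j in range(len(x))
    ]
    return [max(0.0, g) for g in grad_input]         # keep positive gradients only


W1 = [[1.0, -1.0], [0.5, 2.0]]   # hypothetical first-layer weights
w2 = [1.0, -1.0]                 # hypothetical output weights
x = [2.0, 1.0]                   # hypothetical flattened input sample
print(attention_map(x, W1, w2))  # [0.5, 0.0]
```

In this toy case the raw input gradient is `[0.5, -3.0]`, so the second input element is dropped from the attention map after clipping.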

2. Attention Regularization:

Tracking-by-detection frameworks usually treat the target object as the positive class and the background as the negative class when training a binary classifier. For each input sample $I_0$ we obtain two attention maps: the positive attention map, denoted $A_p$, and the negative attention map, denoted $A_n$.

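For a positive training sample, we want the values of $A_p$ related to the target object to be as large as possible and, by contrast, the pixel values of $A_n$ to be as small as possible. The attention regularization term for positive samples is defined accordingly (Eq. 3).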

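Here $\mu$ and $\sigma$ are the mean and standard deviation operators. For negative training samples, on the other hand, the corresponding regularization term is constructed in the mirrored way (Eq. 4).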

Adding the two terms from Eqs. 3 and 4 to the original classification loss gives the final loss (Eq. 5).
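A form of Eqs. 3 to 5 that matches the description in this section is sketched below. It is a reconstruction, not a copy of the paper's typesetting: the names $R_{(y=1)}$, $R_{(y=0)}$, $L_{CE}$ (the classification loss), $\lambda$ (the regularization weight), and $y$ (the sample label) are assumptions introduced here.

$$R_{(y=1)} = \frac{\sigma_{A_p}}{\mu_{A_p}} + \frac{\mu_{A_n}}{\sigma_{A_n}} \tag{3}$$

$$R_{(y=0)} = \frac{\mu_{A_p}}{\sigma_{A_p}} + \frac{\sigma_{A_n}}{\mu_{A_n}} \tag{4}$$

$$L = L_{CE} + \lambda \left[ y \, R_{(y=1)} + (1 - y) \, R_{(y=0)} \right] \tag{5}$$

Minimizing $R_{(y=1)}$ raises the mean and lowers the standard deviation of $A_p$ while doing the opposite for $A_n$, which is the behavior the text describes for positive samples.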

Eq. 5 shows how the attention maps influence the training of the deep classifier.

For positive samples, we aim to increase the attention around the target object in two aspects. The first one is to increase the mean but decrease the standard deviation of $A_p$ so that the pixel intensity values are large and with small variance. The second one is to decrease the mean but increase the standard deviation of $A_n$ so that the pixel intensity values are small and with large variance. These two aspects reflect that the classifier learns to increase the true positive rates while decreasing the false negative rates.

A similar intuition is shown in Eq. 4 where we decrease the false positive rates and increase the true negative rates of the classifier. As a result, the regularization terms help in increasing the classification accuracy by using the constraint from attention maps. This contributes to the classifier training process as the attention maps heavily influence the output class scores as shown in Eq. 1.
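The behavior described for the two regularization terms can be checked numerically with a small sketch. It assumes the ratio form (standard deviation over mean for the map that should be strong, mean over standard deviation for the map that should be weak), flattened attention maps given as lists, and two hypothetical constants that are not from the paper: `EPS` to avoid division by zero and the weight `lam`.

```python
import math
from statistics import mean, pstdev

EPS = 1e-8  # assumed stabiliser, not part of the paper's formulation


def attention_regularizer(a_pos, a_neg, label):
    """Regularization term for one sample, given its two flattened attention maps."""
    mu_p, sd_p = mean(a_pos), pstdev(a_pos)
    mu_n, sd_n = mean(a_neg), pstdev(a_neg)
    if label == 1:
        # positive sample: want A_p large and uniform, A_n small and spread out
        return sd_p / (mu_p + EPS) + mu_n / (sd_n + EPS)
    # negative sample: the roles of the two maps are swapped
    return mu_p / (sd_p + EPS) + sd_n / (mu_n + EPS)


def total_loss(prob_pos, label, a_pos, a_neg, lam=1.0):
    """Cross-entropy on the predicted positive probability plus the weighted regularizer."""
    p_true = prob_pos if label == 1 else 1.0 - prob_pos
    cross_entropy = -math.log(max(p_true, EPS))
    return cross_entropy + lam * attention_regularizer(a_pos, a_neg, label)


# A positive sample whose positive map is strong and uniform is penalised less
# than one whose positive map is weak and whose negative map is strong.
focused = attention_regularizer([4, 4, 4, 4], [0, 0, 0, 2], label=1)
diffuse = attention_regularizer([0, 0, 0, 2], [4, 4, 4, 4], label=1)
print(focused < diffuse)  # True
```

The comparison at the end reflects the intuition in the quoted paragraph: for a positive sample, the regularizer is smaller when $A_p$ has a high mean and low variance and $A_n$ has a low mean.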

3. Reciprocative Learning:

With the regularization terms incorporated into the loss function, reciprocative learning is carried out using the back-propagation (BP) algorithm and the chain rule.

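The proposed scheme is used only during training. It makes the classifier attend selectively to the target object and gradually ignore background objects. An accompanying figure in the paper sketches this process.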
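The loop of "forward pass, backward pass to get attention maps, loss with attention regularization, parameter update" can be sketched on a toy model. Everything below is an assumption for illustration: the tiny tanh scorer, the constants, and above all the use of finite differences in place of real back-propagation (the paper relies on the chain rule in an autograd framework). The update only accepts a step that lowers the loss, so the sketch stays stable.

```python
import math

EPS = 1e-6   # assumed stabiliser for divisions
H = 1e-4     # finite-difference step (stand-in for back-propagation)


def class_scores(theta, x):
    """Toy scorer: 2 tanh hidden units, then (positive, negative) class scores."""
    n = len(x)
    hidden = [math.tanh(sum(theta[k * n + j] * x[j] for j in range(n))) for k in range(2)]
    out = theta[2 * n:]  # 4 output weights
    return [out[0] * hidden[0] + out[1] * hidden[1],
            out[2] * hidden[0] + out[3] * hidden[1]]


def attention(theta, x, c):
    """Positive part of d score_c / d x, with theta held fixed."""
    base = class_scores(theta, x)[c]
    grads = []
    for j in range(len(x)):
        bumped = list(x)
        bumped[j] += H
        grads.append(max(0.0, (class_scores(theta, bumped)[c] - base) / H))
    return grads


def mean_std(values):
    mu = sum(values) / len(values)
    var = sum((v - mu) * (v - mu) for v in values) / len(values)
    return mu, math.sqrt(var)


def loss(theta, x, label, lam=0.1):
    """Softmax cross-entropy plus the attention regularizer for one sample."""
    s_pos, s_neg = class_scores(theta, x)
    m = max(s_pos, s_neg)
    log_z = m + math.log(math.exp(s_pos - m) + math.exp(s_neg - m))
    cross_entropy = log_z - (s_pos if label == 1 else s_neg)
    mu_p, sd_p = mean_std(attention(theta, x, 0))
    mu_n, sd_n = mean_std(attention(theta, x, 1))
    if label == 1:
        reg = sd_p / (mu_p + EPS) + mu_n / (sd_n + EPS)
    else:
        reg = mu_p / (sd_p + EPS) + sd_n / (mu_n + EPS)
    return cross_entropy + lam * reg


def train_step(theta, x, label, lr=0.05):
    """One update: the loss depends on theta both through the scores and the attention maps."""
    base = loss(theta, x, label)
    grad = []
    for i in range(len(theta)):
        bumped = list(theta)
        bumped[i] += H
        grad.append((loss(bumped, x, label) - base) / H)
    step = lr
    for _ in range(20):  # backtracking: halve the step until the loss improves
        candidate = [t - step * g for t, g in zip(theta, grad)]
        if loss(candidate, x, label) < base:
            return candidate
        step *= 0.5
    return theta  # no improving step found; keep the parameters


theta = [0.3, -0.2, 0.1, -0.1, 0.4, 0.2, 0.5, -0.3, -0.4, 0.6]  # hypothetical init
x = [1.0, 0.5, -0.5]                                            # hypothetical positive sample
initial = loss(theta, x, 1)
for _ in range(5):
    theta = train_step(theta, x, 1)
final = loss(theta, x, 1)
```

Because a step is accepted only when it lowers the loss, `final` can never exceed `initial`; how much it drops depends on the toy data and is not meant to reflect the paper's results.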

4. Tracking Process:

4.1 Model Initialization:

Training samples are drawn around the target and split into positive and negative samples according to their IoU with the ground truth (GT), using a threshold of 0.5.
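A minimal sketch of this split follows. The `(x, y, w, h)` box format with a top-left origin and the choice to treat IoU strictly above the threshold as positive are assumptions; the notes only give the 0.5 threshold.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x, y, w, h), top-left origin."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def label_samples(samples, gt_box, threshold=0.5):
    """Split candidate boxes into positives (IoU > threshold) and negatives (the rest)."""
    positives = [s for s in samples if iou(s, gt_box) > threshold]
    negatives = [s for s in samples if iou(s, gt_box) <= threshold]
    return positives, negatives


gt = (0, 0, 10, 10)                                      # hypothetical ground-truth box
samples = [(1, 0, 10, 10), (5, 0, 10, 10), (20, 20, 5, 5)]
positives, negatives = label_samples(samples, gt)
```

Here the first sample overlaps the ground truth heavily and lands in `positives`, while the half-shifted and the disjoint boxes land in `negatives`.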

4.2 Online Detection:

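Starting from the tracking result of the previous frame, N2 samples are drawn and fed into the model, and the proposal with the highest response score is selected. A BBox regression model is also applied to refine the box and obtain a more accurate result.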
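The selection step can be sketched as follows. The Gaussian sampling scheme, its spread parameters, and the stand-in `score_fn` are assumptions (in the tracker the score comes from the trained classifier), and the bounding-box regression step is left out.

```python
import math
import random


def sample_candidates(prev_box, n, rng, pos_sigma=0.1, scale_sigma=0.05):
    """Draw n boxes (x, y, w, h) by perturbing the previous box's position and scale."""
    x, y, w, h = prev_box
    candidates = []
    for _ in range(n):
        scale = math.exp(rng.gauss(0.0, scale_sigma))  # always positive
        dx = rng.gauss(0.0, pos_sigma) * w
        dy = rng.gauss(0.0, pos_sigma) * h
        candidates.append((x + dx, y + dy, w * scale, h * scale))
    return candidates


def detect(prev_box, score_fn, n=256, seed=0):
    """Return the sampled candidate with the highest classification score."""
    rng = random.Random(seed)
    candidates = sample_candidates(prev_box, n, rng)
    return max(candidates, key=score_fn)


def toy_score(box):
    """Hypothetical stand-in for the classifier: prefers boxes near (12, 8)."""
    return -abs(box[0] - 12.0) - abs(box[1] - 8.0)


best_box = detect((10.0, 10.0, 20.0, 20.0), toy_score)
```

Here `n` plays the role of N2 in the text; its default of 256 is only a placeholder.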

4.3 Model Update:

The model is updated once every T frames; each update runs H2 iterations on the fc layers.
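The update schedule amounts to a simple counter inside the tracking loop. The sketch below shows only the scheduling; `detect_fn` and `update_fn` are hypothetical callbacks, and the values used for T (`update_interval`) and H2 (`fc_iterations`) are placeholders, since these notes do not give them.

```python
def run_tracker(num_frames, update_interval, fc_iterations, detect_fn, update_fn):
    """Detect on every frame; every `update_interval` frames run `fc_iterations` updates."""
    for frame_idx in range(1, num_frames + 1):
        detect_fn(frame_idx)
        if frame_idx % update_interval == 0:
            for _ in range(fc_iterations):
                update_fn(frame_idx)  # one fine-tuning iteration of the fc layers


detections, updates = [], []
run_tracker(25, 10, 3, detections.append, updates.append)
print(len(detections), len(updates))  # 25 6
```

With 25 frames, T = 10, and H2 = 3, updates are triggered at frames 10 and 20, for six fc-layer iterations in total.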

5. Experiments:
