[Arxiv1706] Few-Example Object Detection with Model Communication: Paper Notes


https://arxiv.org/pdf/1706.08249.pdf

Few-Example Object Detection with Model Communication, Xuanyi Dong, Liang Zheng, Fan Ma, Yi Yang, Deyu Meng

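Highlights

This paper performs object detection with only 3-4 bounding-box annotations per class, and its performance is comparable to methods that use a large number of training examples.
Main approach: multi-modal learning (several models trained simultaneously) + self-paced learning (curriculum learning)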

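Related work

This section introduces several concepts that are easily confused, together with the methods related to them.

Weakly supervised object detection: the labels in the dataset are unreliable. For a pair (x, y), the label y assigned to x cannot be fully trusted. Unreliable here can mean incorrect labels, multiple labels, insufficient labels, partial labels, and so on.

The labels are image-level class labels [7][8][9][10][11][18][30][31][32][33][34]

Semi-supervised object detection: semi-supervised learning uses a large amount of unlabeled data together with labeled data for pattern recognition.

Some training samples have only class labels, while others have detailed bounding-box and class annotations [4][5][6]

Requires a large number of annotations (e.g., 50% of the full annotations)

Only a few bounding-box annotations per class (Few-Example Object Detection with Model Communication) [12][35]

Difference from few-shot learning: whether unlabeled data is used for learning

Mining location annotations from videos; these methods mainly target objects that move [2][3][29][1]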

Webly supervised learning for object detection: reduce the annotation cost by leveraging web data

Method

Basic detector: Faster RCNN & RFCN

Object proposal method: selective search & edge boxes

Annotations: when we randomly annotate approximately four images for each class, an image may contain several objects, and we annotate all the object bounding boxes.

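Parameter updates:
Updating vj: taking the derivative of the loss function gives the solution for vj

For the same image i and the same model j, if several samples would have vj = 1, only the sample with the smallest Lc is set to 1 and the others are set to 0. gamma encourages the models to share information: when vj is 1, the selection threshold becomes larger, so the image is more easily selected.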
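
The selection rule for vj can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' code: the threshold form `lam + gamma * other_votes` is an assumption chosen so that the threshold grows when other models already select the image, and the names `update_selection`, `lam`, `gamma` and `other_votes` are hypothetical.

```python
def update_selection(losses, other_votes, lam, gamma):
    """Self-paced selection for one model j (illustrative sketch).

    losses:      dict image_id -> list of L_c values, one per candidate sample
    other_votes: dict image_id -> number of other models that selected the image
    lam, gamma:  pace threshold and model-communication weight (assumed names)
    Returns dict image_id -> list of 0/1 selection flags (the v_j values).
    """
    selection = {}
    for image_id, sample_losses in losses.items():
        # Assumed threshold: it rises when other models picked this image.
        threshold = lam + gamma * other_votes.get(image_id, 0)
        flags = [0] * len(sample_losses)
        if sample_losses:
            # At most one sample per image: the one with the smallest loss.
            best = min(range(len(sample_losses)), key=sample_losses.__getitem__)
            if sample_losses[best] < threshold:
                flags[best] = 1
        selection[image_id] = flags
    return selection


if __name__ == "__main__":
    demo_losses = {"a": [0.2, 0.5], "b": [0.9], "c": [0.9]}
    demo_votes = {"a": 0, "b": 0, "c": 1}
    print(update_selection(demo_losses, demo_votes, lam=0.6, gamma=0.5))
```

In the demo, images "b" and "c" have the same loss, but only "c" is selected because another model already voted for it, which is the information-sharing effect attributed to gamma.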

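Updating wj: done in the same way as in other papers

Updating yuj: to update yuj, we need to find, among a set of bounding boxes, the solution that satisfies the condition given in the paper.

The optimal solution is hard to find directly. The approach used in the paper is to feed the predictions of all models into NMS and keep only the high-scoring results via a threshold; the results that remain form yuj.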
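
The pseudo-label update can be sketched as pooling every model's detections, running greedy NMS, and keeping only confident survivors. This is a rough sketch rather than the paper's implementation: the per-class NMS, the threshold values, the `(box, score, label)` tuple layout and the function names are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(dets, iou_thresh):
    """Greedy per-class NMS over (box, score, label) tuples."""
    kept = []
    for det in sorted(dets, key=lambda d: d[1], reverse=True):
        # Keep a box unless a higher-scoring box of the same class overlaps it.
        if all(det[2] != k[2] or iou(det[0], k[0]) <= iou_thresh for k in kept):
            kept.append(det)
    return kept


def update_pseudo_labels(per_model_dets, iou_thresh=0.3, score_thresh=0.8):
    """Pool detections from all models, apply NMS, keep high-scoring boxes.

    The threshold defaults are placeholder values, not taken from the paper.
    """
    pooled = [d for dets in per_model_dets for d in dets]
    return [d for d in nms(pooled, iou_thresh) if d[1] >= score_thresh]


if __name__ == "__main__":
    model_a = [((0, 0, 10, 10), 0.9, "cat"), ((50, 50, 60, 60), 0.4, "dog")]
    model_b = [((1, 1, 10, 10), 0.85, "cat"), ((100, 100, 120, 120), 0.95, "dog")]
    for box, score, label in update_pseudo_labels([model_a, model_b]):
        print(label, score, box)
```

Here the two overlapping "cat" boxes from different models collapse into one pseudo label, and the low-scoring "dog" box is dropped by the score threshold.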

Removing hard examples: we employ a modified NMS (intersection/max(area1,area2)) to filter out the nested boxes, which usually occur when there are multiple overlapping objects. If there are too many boxes (≥ 4) for one specific class or too many classes (≥ 4) in the image, this image will be removed. Images in which no reliable pseudo objects are found are filtered out.
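
The hard-example filter can be sketched as follows. The overlap measure intersection/max(area1, area2) and the two "≥ 4" limits come from the text above; everything else (applying the suppression per class, the `overlap_thresh` value, the tuple layout and the function names) is an assumption made for this illustration.

```python
from collections import Counter


def modified_overlap(a, b):
    """intersection / max(area1, area2) for boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    largest = max(area_a, area_b)
    return (iw * ih) / largest if largest > 0 else 0.0


def filter_image(dets, overlap_thresh=0.5, max_boxes_per_class=4, max_classes=4):
    """Return cleaned pseudo labels for one image, or None if it is dropped.

    dets is a list of (box, score, label); overlap_thresh is a placeholder value.
    """
    kept = []
    for det in sorted(dets, key=lambda d: d[1], reverse=True):
        # Greedy suppression within a class using the modified overlap.
        if all(det[2] != k[2] or modified_overlap(det[0], k[0]) <= overlap_thresh
               for k in kept):
            kept.append(det)
    if not kept:
        return None  # no reliable pseudo objects found
    counts = Counter(label for _, _, label in kept)
    if len(counts) >= max_classes:
        return None  # too many classes in the image
    if any(n >= max_boxes_per_class for n in counts.values()):
        return None  # too many boxes for one class
    return kept


if __name__ == "__main__":
    simple = [((0, 0, 10, 10), 0.9, "cat"), ((0, 0, 10, 9), 0.8, "cat")]
    crowded = [((i * 20, 0, i * 20 + 10, 10), 0.9, "person") for i in range(4)]
    print(filter_image(simple))
    print(filter_image(crowded))
```

The first image keeps a single "cat" box, while the crowded image is discarded because one class reaches four boxes.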

Experiments

Comparison with the state of the art (4.2 images per class are annotated)

VOC 2007: -1.1 mAP, correct localization +0.9% compared with [21]
VOC 2012: -2.5 mAP compared with [21], correct localization +9.8%
ILSVRC 2013: -2.4 mAP compared with [21]
COCO 2014: +1.3 mAP compared with [22]

[20] V. Kantorov, M. Oquab, M. Cho, and I. Laptev, “ContextLocNet: Context-aware deep network models for weakly supervised localization,” in European Conference on Computer Vision, 2016.
[21] A. Diba, V. Sharma, A. Pazandeh, H. Pirsiavash, and L. Van Gool, “Weakly supervised cascaded convolutional networks,” 2017.
[22] Y. Zhu, Y. Zhou, Q. Ye, Q. Qiu, and J. Jiao, “Soft proposal networks for weakly supervised object localization,” in International Conference on Computer Vision, 2017.

Ablation study

VOC 2007: +4.1 mAP compared with model ensemble
k: number of labeled images per class; w/ image labels: image-level supervision incorporated

Limitations

Although localization reaches a certain accuracy, many objects in hard-example images are missed (in other words, few-example classification performs poorly).
