https://www.csee.umbc.edu/~hpirsiav/papers/cascade_cvpr17.pdf

Weakly Supervised Cascaded Convolutional Networks (CVPR 2017), Ali Diba, Vivek Sharma, Ali Pazandeh, Hamed Pirsiavash and Luc Van Gool

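Highlights

  • Stacking multiple tasks (classification, segmentation) improves the accuracy of weakly supervised detection of multiple objects
  • Using segmentation to select clean proposals gives more robust results
  • Designs a fairly robust loss for the weakly supervised segmentation task
    • Only the global classification result and the high-confidence regions are considered
    • Weights in the loss focus training on the regions that most need attention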

Related work

One of the most common approaches [7] consists of the following steps:

  • generates object proposals,
  • extracts features from the proposals,
  • applies multiple instance learning (MIL) to the features and finds the box labels from the weak bag (image) labels.
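The last step above can be sketched in a few lines. This is only a minimal illustration of the MIL idea, assuming max-over-proposals aggregation (one common choice, not necessarily what [7] uses); the function names `mil_image_scores` and `mil_select_boxes` are made up for this note.

```python
import numpy as np

def mil_image_scores(proposal_scores: np.ndarray) -> np.ndarray:
    """Aggregate per-proposal class scores (num_proposals, num_classes)
    into image-level (bag) scores by taking the max over proposals.
    Assumption: max aggregation; other MIL variants use other poolings."""
    return proposal_scores.max(axis=0)

def mil_select_boxes(proposal_scores: np.ndarray, image_labels) -> dict:
    """For every class marked present in the weak image label, return the
    index of the highest-scoring proposal as the inferred box for that class."""
    return {
        c: int(proposal_scores[:, c].argmax())
        for c, present in enumerate(image_labels)
        if present
    }

# Three proposals, two classes; only class 0 is present in the image label.
scores = np.array([[0.1, 0.9],
                   [0.8, 0.2],
                   [0.3, 0.4]])
print(mil_image_scores(scores))          # bag-level score per class
print(mil_select_boxes(scores, [1, 0]))  # proposal chosen for each present class
```

The image-level scores are what the weak (bag) label supervises; the argmax proposal is the box label that falls out as a by-product.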

Difficulty of weakly supervised object detection: it is very sensitive to initialization, and a poor initialization can trap the network in a local optimum. The main remedies are:

  • improving the initialization [31, 9, 28, 29]
  • regularizing the optimization strategy [4, 5, 7]
  • [17] employ an iterative self-learning strategy that gradually adds harder samples to a small set of initial samples
  • [15] use a convex relaxation of the soft-max loss

The majority of previous works [25, 32] train their object detectors on a large collection of noisy object proposals. In contrast, our method focuses on a small, clean collection of object proposals, which is far more reliable, robust, and computationally efficient, and gives better performance.

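Method

Two-stage: proposal and image classification (conv1 through conv5, global pooling) + multiple instance learning (2 fc layers, score layer)

1. image classification: a CNN with global average pooling (GAP), as introduced in [36]. The weights of the classification fc layer are used as weights on the output channels of the last convolutional layer, and the weighted sum over all channels is the class activation map. This step also produces a classification loss L_GAP.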
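A minimal NumPy sketch of the class activation map computation described above. The array shapes, the omission of the fc bias, and the min-max normalization are assumptions made here for illustration; the function names are hypothetical.

```python
import numpy as np

def gap_logits(features: np.ndarray, fc_weights: np.ndarray) -> np.ndarray:
    """Classification branch: global average pooling followed by a linear layer.
    features: (C, H, W) output of the last conv layer.
    fc_weights: (num_classes, C). The bias is omitted (assumption)."""
    pooled = features.mean(axis=(1, 2))  # (C,)
    return fc_weights @ pooled           # (num_classes,)

def class_activation_map(features: np.ndarray, fc_weights: np.ndarray,
                         class_idx: int) -> np.ndarray:
    """Weight every channel of the conv output by the fc weight of the chosen
    class and sum over channels, giving an (H, W) map."""
    cam = np.tensordot(fc_weights[class_idx], features, axes=([0], [0]))
    # Min-max normalize to [0, 1] so the map can be thresholded later
    # (normalization is a convenience assumed here).
    cam = cam - cam.min()
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Toy example: 2 channels on a 2x2 grid, 3 classes.
features = np.array([[[1.0, 2.0], [3.0, 4.0]],
                     [[0.0, 1.0], [0.0, 1.0]]])
fc_weights = np.array([[1.0, 0.0],
                       [0.0, 1.0],
                       [1.0, 1.0]])
print(gap_logits(features, fc_weights))
print(class_activation_map(features, fc_weights, class_idx=0))
```

Because pooling and the linear layer are both linear, the spatial mean of the unnormalized map equals the class logit, which is why the same fc weights can be reused for localization.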

[36] B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba. Learning deep features for discriminative localization. In CVPR, 2016.

2. multiple instance learning

Proposal: EdgeBoxes [37] is used to generate an initial set of object proposals. Then we threshold the class activation map [36] to come up with a mask. Finally, we choose the initial boxes with the largest overlap with the mask.
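The proposal filtering step can be sketched as follows. The notes do not say how overlap is measured or which threshold is used, so the intersection-over-union measure, the 0.5 threshold, and `top_k` are assumptions; the boxes stand in for EdgeBoxes output, and all function names are hypothetical.

```python
import numpy as np

def cam_to_mask(cam: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Binarize a class activation map normalized to [0, 1]."""
    return cam >= threshold

def box_mask_overlap(box, mask: np.ndarray) -> float:
    """Intersection-over-union between a box and a binary mask.
    box = (x1, y1, x2, y2) in pixel coordinates, end-exclusive."""
    x1, y1, x2, y2 = box
    box_mask = np.zeros_like(mask, dtype=bool)
    box_mask[y1:y2, x1:x2] = True
    inter = np.logical_and(box_mask, mask).sum()
    union = np.logical_or(box_mask, mask).sum()
    return float(inter / union) if union > 0 else 0.0

def select_proposals(boxes, cam: np.ndarray, threshold: float = 0.5,
                     top_k: int = 2):
    """Keep the top_k boxes that overlap most with the thresholded CAM."""
    mask = cam_to_mask(cam, threshold)
    scores = [box_mask_overlap(b, mask) for b in boxes]
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    return [boxes[i] for i in order[:top_k]]

# Toy CAM with a hot 2x2 region in the centre of a 4x4 map.
cam = np.zeros((4, 4))
cam[1:3, 1:3] = 0.9
boxes = [(0, 0, 4, 4), (1, 1, 3, 3), (0, 0, 1, 1)]
print(select_proposals(boxes, cam, top_k=1))
```

Keeping only a few well-aligned boxes is what gives the small, clean proposal set that the later MIL stage works on.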

Three-stage: more information about the objects' boundaries, learned in a segmentation task, can lead to a better appearance model and thus better object localization.

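  • Main idea: a segmentation supervision signal helps improve localization accuracy.
  • Weak segmentation supervision signal: the mask obtained from the previous stage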
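One way to picture training on such a weak mask is a per-pixel loss in which only confident pixels count. The sketch below is an illustration only and not the paper's formulation: the binary cross-entropy form, the `low`/`high` thresholds, and the function names are all assumptions.

```python
import numpy as np

def confidence_weights(cam: np.ndarray, low: float = 0.3,
                       high: float = 0.7) -> np.ndarray:
    """Weight 1 for pixels that are clearly foreground (>= high) or clearly
    background (<= low), weight 0 for the uncertain band in between.
    The thresholds are hypothetical values."""
    return ((cam >= high) | (cam <= low)).astype(float)

def weighted_mask_loss(pred: np.ndarray, pseudo_mask: np.ndarray,
                       weights: np.ndarray, eps: float = 1e-7) -> float:
    """Weighted binary cross-entropy between predicted foreground
    probabilities and the pseudo mask from the previous stage."""
    pred = np.clip(pred, eps, 1.0 - eps)
    target = pseudo_mask.astype(float)
    bce = -(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))
    total = weights.sum()
    return float((weights * bce).sum() / total) if total > 0 else 0.0

# Toy example: the uncertain pixel (CAM value 0.5) is ignored by the loss.
cam = np.array([[0.1, 0.5],
                [0.9, 0.3]])
pseudo_mask = cam >= 0.5
pred = np.array([[0.2, 0.9],
                 [0.8, 0.1]])
weights = confidence_weights(cam)
print(weights)
print(weighted_mask_loss(pred, pseudo_mask, weights))
```

Zeroing the weight of uncertain pixels keeps errors in the pseudo mask from dominating the gradient, which matches the intent of the highlights about considering only high-confidence regions.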

Experimental results

PASCAL VOC 2007

  • +3.3% classification compared with [18]
  • +1.6% correct localization compared with [27]
  • +0.6% compared with [6]

PASCAL VOC 2010

  • +3.3% compared with [6]

PASCAL VOC 2012

  • +8.8% compared with [18]

ILSVRC 2013

  • +5.5% compared with [18]

Object detection training

  • PASCAL VOC 2007 test set: Faster R-CNN trained on the pseudo ground-truth (GT) bounding boxes generated by our cascaded networks performs slightly better than our transferred model (+0.3%).

[6] H. Bilen and A. Vedaldi. Weakly supervised deep detection networks. In CVPR, 2016.

[18] D. Li, J.-B. Huang, Y. Li, S. Wang, and M.-H. Yang. Weakly supervised object localization with progressive domain adaptation. In CVPR, 2016.

[27] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.
