[CVPR2017] Weakly Supervised Cascaded Convolutional Networks: Paper Notes
https://www.csee.umbc.edu/~hpirsiav/papers/cascade_cvpr17.pdf
Weakly Supervised Cascaded Convolutional Networks, Ali Diba, Vivek Sharma, Ali Pazandeh, Hamed Pirsiavash and Luc Van Gool
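Highlights
- Stacking multiple tasks (classification, segmentation) improves accuracy for multi-object weakly supervised detection
- Using segmentation to filter for clean proposals gives more robust results
- A fairly robust loss is designed for the weakly supervised segmentation task
- It considers only the global classification result and the high-confidence regions
- The loss weights focus on the regions that most need attention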
Related work
One of the most common approaches [7] consists of the following steps:
- generates object proposals,
- extracts features from the proposals,
- applies multiple instance learning (MIL) to the features and finds the box labels from the weak bag (image) labels.
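The three steps above can be illustrated with a minimal sketch of the MIL step. It assumes per-proposal class scores are already available and uses max pooling over proposals as the bag-level aggregation; this is one common MIL choice, not necessarily the one used in [7] or in this paper. The function names `mil_image_scores` and `mil_box_labels` are hypothetical.

```python
import numpy as np

def mil_image_scores(proposal_scores):
    """Aggregate per-proposal scores (R x C) into image-level scores (C,).

    Assumption: max pooling over proposals, so the image (bag) is positive
    for a class if at least one proposal scores high for it.
    """
    return proposal_scores.max(axis=0)

def mil_box_labels(proposal_scores, image_labels):
    """For each class present in the image, pick the top-scoring proposal."""
    present = np.flatnonzero(image_labels)
    return {int(c): int(proposal_scores[:, c].argmax()) for c in present}

# Toy data: 4 proposals, 3 classes; the image contains classes 0 and 1.
scores = np.array([[0.1, 0.7, 0.2],
                   [0.8, 0.1, 0.1],
                   [0.3, 0.2, 0.1],
                   [0.2, 0.9, 0.0]])
labels = np.array([1, 1, 0])

image_scores = mil_image_scores(scores)  # bag-level score per class
picked = mil_box_labels(scores, labels)  # class index -> proposal index
```

In training, the image-level scores would be compared against the weak image labels, and the selected proposals act as the inferred box labels.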
Difficulty of weakly supervised object detection: it is highly sensitive to initialization, and a poor initialization can trap the network in a local optimum. The main remedies are:
- improving the initialization [31, 9, 28, 29]
- regularizing the optimization strategies [4, 5, 7]
- [17] uses an iterative self-learning strategy that gradually adds harder samples to a small set of initial samples
- [15] uses a convex relaxation of the softmax loss
The majority of previous works [25, 32] use a large collection of noisy object proposals to train their object detector. In contrast, the paper's method focuses on a small, clean collection of object proposals that is far more reliable and robust, is computationally efficient, and gives better performance.
Method
Two-stage: proposal and image classification (conv1 through conv5, global pooling) + multiple instance learning (2 fc layers, score layer)
1. Image classification: a CNN with global average pooling (GAP), introduced in [36]. The weights of the classification fc layer are applied to the outputs of the last convolutional layer, and the weighted sum over all channels is the class activation map. This step also produces a classification loss L_GAP.
[36] B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba. Learning deep features for discriminative localization. In CVPR, 2016.
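The class activation map described in step 1 can be sketched in a few lines of NumPy. This is a minimal illustration, assuming a feature tensor of shape (K, H, W) from the last convolutional layer and a bias-free fc layer with weights of shape (C, K) applied after GAP; the function names and the random toy data are hypothetical.

```python
import numpy as np

def gap_logits(features, fc_weights):
    """Class logits: global average pooling over H, W, then the fc layer."""
    pooled = features.mean(axis=(1, 2))  # (K,)
    return fc_weights @ pooled           # (C,)

def class_activation_map(features, fc_weights, class_idx):
    """Weight each channel by the fc weight of the chosen class and sum.

    features:   (K, H, W) output of the last conv layer
    fc_weights: (C, K) weights of the fc layer that follows GAP
    """
    return np.tensordot(fc_weights[class_idx], features, axes=([0], [0]))  # (H, W)

rng = np.random.default_rng(0)
features = rng.random((8, 7, 7))          # toy conv output, K=8
fc_weights = rng.standard_normal((3, 8))  # toy fc weights, C=3

logits = gap_logits(features, fc_weights)
cam = class_activation_map(features, fc_weights, class_idx=1)
```

Because GAP and the fc layer are both linear, the spatial mean of the map for a class equals that class's logit (without bias), which is why the map can be read as a per-location contribution to the classification score.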
2. Multiple instance learning
Proposal: EdgeBoxes [37] is used to generate an initial set of object proposals. The class activation map [36] is then thresholded to produce a mask. Finally, the initial boxes with the largest overlap with the mask are chosen.
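The proposal filtering step can be sketched as follows. The notes do not give the threshold or the exact overlap measure, so this sketch assumes a threshold relative to the map's maximum (`rel_threshold=0.2`) and measures overlap as intersection over union between a box and the mask pixels. All function names are hypothetical, and boxes are `(x0, y0, x1, y1)` in pixel coordinates with exclusive upper bounds.

```python
import numpy as np

def cam_mask(cam, rel_threshold=0.2):
    """Binary mask of pixels whose activation is at least rel_threshold * max.

    Assumption: a relative threshold; the value 0.2 is illustrative only.
    """
    return cam >= rel_threshold * cam.max()

def box_mask_iou(box, mask):
    """IoU between a box and the set of mask pixels (assumed overlap measure)."""
    x0, y0, x1, y1 = box
    inter = mask[y0:y1, x0:x1].sum()
    union = (x1 - x0) * (y1 - y0) + mask.sum() - inter
    return inter / union

def select_boxes(boxes, mask, top_k=1):
    """Keep the top_k proposals with the largest overlap with the mask."""
    ious = np.array([box_mask_iou(b, mask) for b in boxes])
    order = np.argsort(-ious)[:top_k]
    return [boxes[i] for i in order]

# Toy activation map with one active 3x3 region.
cam = np.zeros((6, 6))
cam[2:5, 1:4] = 1.0
mask = cam_mask(cam)

# Stand-ins for proposals that EdgeBoxes would generate.
boxes = [(0, 0, 6, 6), (1, 2, 4, 5), (3, 3, 6, 6)]
best = select_boxes(boxes, mask, top_k=1)
```

Here the second box matches the active region exactly, so it is the one kept; the full-image box and the partially overlapping box score lower.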
Three-stage: a segmentation task teaches the network more about object boundaries, which can lead to a better appearance model and, in turn, better object localization.
- Main idea: a segmentation supervision signal helps improve localization accuracy.
- Weak segmentation supervision signal: the mask obtained from the previous stage
Experimental results
PASCAL VOC 2007
- +3.3% classification compared with [18]
- +1.6% correct localization compared with [27]
- +0.6% compared with [6]
PASCAL VOC 2010
- +3.3% compared with [6]
PASCAL VOC 2012
- +8.8% compared with [18]
ILSVRC 2013
- +5.5% compared with [18]
Object detection training
- PASCAL VOC 2007 test set: Faster R-CNN trained on the pseudo ground-truth (GT) bounding boxes generated by the cascaded networks performs slightly better than the transferred model (+0.3%).
[6] H. Bilen and A. Vedaldi. Weakly supervised deep detection networks. In CVPR, 2016.
[18] D. Li, J.-B. Huang, Y. Li, S. Wang, and M.-H. Yang. Weakly supervised object localization with progressive domain adaptation. In CVPR, 2016.
[27] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.