Is object localization for free? – Weakly-supervised learning with convolutional neural networks (CVPR 2015). Maxime Oquab, Leon Bottou, Ivan Laptev, Josef Sivic

http://www.di.ens.fr/~josef/publications/Oquab15.pdf

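Highlights

  • A good title gives readers a reason to start reading.
  • The localization approach, global max pooling over sliding-window scores, is worth borrowing.

Method

The goal of the paper is to design a weakly supervised classification network; note that the main aim is to improve classification. Since the paper dates from 2015, the method is fairly simple and basic.

The authors make the following three modifications to a classification network.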

  • Treat the fully connected layers as convolutions, which allows us to deal with nearly arbitrary-sized images as input.
    • The aim is to apply the network to bigger images in a sliding-window manner, thus extending its output to n×m×K, where n and m denote the number of sliding-window positions in the x- and y-direction in the image, respectively.
    • 3×h×w → convs → K×m×n (K: number of classes)
  • Explicitly search for the highest scoring object position in the image by adding a single global max-pooling layer at the output.
    • K×m×n → K×1×1
    • The max-pooling operation hypothesizes the location of the object in the image at the position with the maximum score.
  • Use a cost function that can explicitly model multiple objects present in the image.
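
The first two modifications above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `fc_as_conv` and `global_max_pool`, the random feature map, and the sizes (8 channels, 3 classes, 5×7 window positions) are all assumptions made for the example.

```python
import numpy as np

def fc_as_conv(features, weights, bias):
    """Apply a fully connected layer at every spatial position (a 1x1 convolution).

    features: (C, m, n) feature map; weights: (K, C); bias: (K,).
    Returns a (K, m, n) score map, one score per class and window position.
    """
    return np.einsum("kc,cmn->kmn", weights, features) + bias[:, None, None]

def global_max_pool(score_map):
    """Reduce a (K, m, n) score map to K image-level scores.

    Also returns the (row, col) position of each maximum, which serves as
    the hypothesized object location for that class.
    """
    K, m, n = score_map.shape
    flat = score_map.reshape(K, m * n)
    idx = flat.argmax(axis=1)
    scores = flat[np.arange(K), idx]
    rows, cols = np.unravel_index(idx, (m, n))
    return scores, np.stack([rows, cols], axis=1)

# Hypothetical sizes: C=8 channels, K=3 classes, 5x7 sliding-window positions.
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 5, 7))
weights = rng.normal(size=(3, 8))
bias = np.zeros(3)

score_map = fc_as_conv(features, weights, bias)   # K x m x n
scores, locations = global_max_pool(score_map)    # K x 1 x 1 scores, plus argmax positions
```

Because the same weights are applied at every position, the input image can be of nearly arbitrary size; only m and n change.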

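Because an image may contain many objects, a multi-class classification loss is not suitable. The authors instead treat the task as multiple binary classification problems, one per class, and define the loss function and the classification score on that basis.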
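
Written out, a sum of K independent logistic losses has the form below. This is a hedged reconstruction rather than a copy of the authors' equations: it assumes labels y_k ∈ {−1, +1} for each class k and takes f_k(x) to be the globally max-pooled score of class k for image x.

```latex
\ell\bigl(f(x), y\bigr) = \sum_{k=1}^{K} \log\bigl(1 + e^{-y_k f_k(x)}\bigr),
\qquad y_k \in \{-1, +1\}

P(k \mid x) = \frac{1}{1 + e^{-f_k(x)}}
```

Under this form each class score can be read as a posterior probability through the logistic function, independently of the other classes.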

training

multi-scale test

Experiments

classification

  • mAP on VOC 2012 test: +3.1% compared with [56]
  • mAP on VOC 2012 test: +7.6% compared with K×1×1 output and single-scale training
  • mAP on VOC: +2.6% compared with RCNN
  • mAP on COCO 62.8%

Localisation

  • Metric: if the maximal response across scales falls within the ground truth bounding box of an object of the same class within 18 pixels tolerance, we label the predicted location as correct. If not, then we count the response as a false positive (it hit the background), and we also increment the false negative count (no object was found).
  • metric on VOC 2012 val: -0.3% compared with RCNN
  • mAP on COCO 41.2%
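
The localization metric above can be sketched as a small check. This is an illustration under one stated assumption: "within 18 pixels tolerance" is interpreted here as enlarging each ground-truth box by 18 pixels on every side. The names `is_localized` and `update_counts` are hypothetical.

```python
def is_localized(pred_xy, gt_boxes, tolerance=18):
    """Return True if the predicted point falls inside any ground-truth box
    of the class, with each box enlarged by `tolerance` pixels on every side.

    pred_xy: (x, y) position of the maximal response across scales.
    gt_boxes: list of (xmin, ymin, xmax, ymax) boxes of the same class.
    """
    x, y = pred_xy
    return any(
        xmin - tolerance <= x <= xmax + tolerance
        and ymin - tolerance <= y <= ymax + tolerance
        for xmin, ymin, xmax, ymax in gt_boxes
    )

def update_counts(counts, pred_xy, gt_boxes, tolerance=18):
    """Update true-positive, false-positive and false-negative counts.

    A miss counts twice, as in the metric: once as a false positive
    (the response hit the background) and once as a false negative
    (no object was found).
    """
    if is_localized(pred_xy, gt_boxes, tolerance):
        counts["tp"] += 1
    else:
        counts["fp"] += 1
        counts["fn"] += 1
    return counts

# Hypothetical example: one ground-truth box and two predicted locations.
boxes = [(50, 50, 100, 100)]
counts = {"tp": 0, "fp": 0, "fn": 0}
update_counts(counts, (110, 60), boxes)   # 10 px outside the box, within tolerance
update_counts(counts, (130, 60), boxes)   # 30 px outside the box, a miss
```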

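Weaknesses

  • The metric used to evaluate localization is not a standard, widely accepted one.
  • Would replacing max pooling with average pooling work better when an image contains multiple instances?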
