Unsupervised Learning of Visual Representations using Videos

Note: this is a learning note on Prof. Gupta's work published at ICCV 2015. It is really exciting to see how an unsupervised method can help learn visual representations! Fei-Fei Li's group also published an ICCV 2015 paper on unsupervised video representation learning at almost the same time; I wrote a separate review of it.

Link: http://arxiv.org/pdf/1505.00687v2.pdf

Motivation:

- Supervised learning dominates CNN training and yields excellent models for many visual problems, while unsupervised learning remains largely unexplored.

- People learn concepts quickly, without needing numerous training instances, and we learn in a dynamic, mostly unsupervised environment.

- Labeled video data for supervised learning is scarce, but tons of unlabeled video are easily available on the Internet, and unsupervised learning can make use of them.

Proposed Model:

Target: learning visual representations from videos in an unsupervised way

Key idea: tracking a moving object provides the supervision signal

Brief introduction:

- Objective function (constraint): take a first patch p1 of a moving object, track it, and take a second patch p2 several frames later; then randomly select a negative patch p- from elsewhere. The objective requires that the distance between p1 and p2 in feature space be smaller than the distance between p1 and p-.

- Selection of tracking patch: extract SURF interest points and use IDT (Improved Dense Trajectories) to estimate the motion of each point, in order to find which part of the frame moves the most. A threshold on the ratio of moving SURF points filters out noise and camera motion.

- Tracking: a KCF (Kernelized Correlation Filter) tracker follows the patch across frames

- Overall pipeline:

Feed the triplet into three identical CNNs, stack two fully-connected layers on top of the pool5 layer to project into the feature space, then compute the ranking loss and back-propagate it through the network. (Note: the three CNNs share parameters.)
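To make the ranking constraint on (p1, p2, p-) concrete, here is a minimal sketch of a hinge-style ranking loss computed on already-extracted feature vectors. The cosine distance, the margin value of 0.5, and the function names `cosine_distance` and `ranking_loss` are assumptions made for illustration; this note does not specify them, and the toy two-dimensional vectors stand in for the CNN outputs.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two feature vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def ranking_loss(f_p1, f_p2, f_neg, margin=0.5):
    """Hinge ranking loss on one triplet of features.

    The loss is zero once the negative patch is at least `margin`
    farther from p1 than the tracked patch p2 is.
    The margin value is an illustrative assumption.
    """
    d_pos = cosine_distance(f_p1, f_p2)
    d_neg = cosine_distance(f_p1, f_neg)
    return max(0.0, d_pos - d_neg + margin)


# Toy features standing in for the outputs of the shared-weight CNN.
f_p1 = [1.0, 0.0]
f_p2 = [1.0, 0.0]   # same direction as p1: distance 0
f_neg = [0.0, 1.0]  # orthogonal to p1: distance 1

print(ranking_loss(f_p1, f_p2, f_neg))  # constraint satisfied, prints 0.0
print(ranking_loss(f_p1, f_neg, f_p2))  # roles swapped, prints 1.5
```

Because the three CNNs share parameters, a single feature extractor applied to all three patches would produce `f_p1`, `f_p2` and `f_neg`; only the loss ties them together.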

Training strategy:

There are many empirical details behind training a more powerful CNN in this work. I am not going to dive into them here and will only briefly review some of the tricks.

- Choice of negative samples:

  - Random selection in the first 10 epochs of training

  - Hard negative mining in later epochs: search over all possible negative patches and choose the top K patches that give the maximum loss
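The hard negative mining step can be sketched as follows: score every candidate negative by the loss it would produce for a given (p1, p2) pair, then keep the K highest-loss candidates. The cosine distance, the margin of 0.5, the helper name `top_k_hard_negatives`, and the toy feature vectors are all assumptions for illustration, not details taken from this note.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two feature vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k_hard_negatives(f_p1, f_p2, candidates, k, margin=0.5):
    """Return indices of the k candidates with the largest ranking loss.

    `candidates` is a list of feature vectors for possible negative patches.
    Ties keep their original order because Python's sort is stable.
    """
    d_pos = cosine_distance(f_p1, f_p2)
    losses = [
        max(0.0, d_pos - cosine_distance(f_p1, f_neg) + margin)
        for f_neg in candidates
    ]
    order = sorted(range(len(candidates)), key=lambda i: losses[i], reverse=True)
    return order[:k]


f_p1 = [1.0, 0.0]
f_p2 = [1.0, 0.0]
candidates = [
    [0.0, 1.0],   # far from p1: easy negative, zero loss
    [1.0, 1.0],   # moderately close to p1
    [-1.0, 0.0],  # opposite direction: easy negative, zero loss
    [1.0, 0.1],   # almost identical to p1: hardest negative
]

print(top_k_hard_negatives(f_p1, f_p2, candidates, k=2))  # prints [3, 1]
```

Negatives that already satisfy the constraint contribute zero loss and no gradient, which is why later epochs focus on the candidates that still violate it.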

Intuition on the result:

In the detection results table, [unsup + fp (3 ensemble)] outperforms the other methods on bus, car, person and train, but falls far behind on bird, cat, dog and sofa. This gap may hint at which kinds of objects the tracking-based supervision captures well.
