Unsupervised Learning of Visual Representations using Videos

Note here: it's a learning note on Prof. Gupta's novel work published on ICCV2015. It's really exciting to know how unsupervised learning method can contribute to learn visual representations! Also, Feifei-Li's group published a paper on video representation using unsupervised method in ICCV2015 almost at the same time! I also wrote a review on it, check it here!

Link: http://arxiv.org/pdf/1505.00687v2.pdf

Motivation:

- Supervised learning is popular for CNN to train an excellent model on various visual problems, while the application of unsupervised learning leaves blank.

- People learn concepts quickly without numerous instances for training, and we learning things in a dynamic, mostly unsupervised environment.

- We’re short of labeled video data to do supervised learning, but we can easily access to tons of unlabeled data through Internet, which can be made use of by unsupervised learning.

Proposed Model:

Target: learning visual representations from videos in an unsupervised way

Key idea: tracking of moving object provides supervision

Brief introduction:

- Objective function (constraint): capture the first patch p1 of a moving object, keep tracking of it and get another patch p2 after several frames, then randomly select a negative patch p- from other places. The idea of objective function constrains the distance of p1 and p2 in feature space should be shorter than distance of p1 and p-

- Selection of tracking patch: using IDT to obtain SURF interest points to find out which part of the frame moves most. Setting threshold on the ratio of SURF interest points to avoid noise and camera motion.

- Tracking: using KCF tracker to track the patch

- Overrall pipline:

Feed triplet into three identical CNN, put two fully-connected layers on the top of pooling-5 layer to project into feature space, then computing the ranking loss to back-propagate the network. (note that: these three CNN shares parameters)

Training strategy:

There’re many empirical details to train a more powerful CNN in this work, however I’m not going to dive into it, only give some brief reviews on some the trick.

- Choose of negative samples:

- Random selection in the first 10 epochs of training

- Hard negative mining in later epochs, we search for all the possible negative patches and choose the top K patches which give maximum loss

* Intuition on the result:

See from the table above, [unsup + fp(3 ensemble)] outperforms other methods on the detection task of bus, car, person and train, but falls far behind on detecting bird, cat, dog and sofa, which may give us some intuitions.

【CV】ICCV2015_Unsupervised Learning of Visual Representations using Videos的更多相关文章

  1. 【CV】ICCV2015_Unsupervised Learning of Spatiotemporally Coherent Metrics

    Unsupervised Learning of Spatiotemporally Coherent Metrics Note here: it's a learning note on the to ...

  2. 【ML】ICML2015_Unsupervised Learning of Video Representations using LSTMs

    Unsupervised Learning of Video Representations using LSTMs Note here: it's a learning notes on new L ...

  3. 【CV】ICCV2015_Unsupervised Visual Representation Learning by Context Prediction

    Unsupervised Visual Representation Learning by Context Prediction Note here: it's a learning note on ...

  4. 【翻译】我钟爱的Visual Studio前端开发工具/扩展

    原文:[翻译]我钟爱的Visual Studio前端开发工具/扩展 怎么样让Visual Studio更好地编写HTML5, CSS3, JavaScript, jQuery,换句话说就是如何更好地做 ...

  5. 论文解读(SimCLR)《A Simple Framework for Contrastive Learning of Visual Representations》

    1 题目 <A Simple Framework for Contrastive Learning of Visual Representations> 作者: Ting Chen, Si ...

  6. A Simple Framework for Contrastive Learning of Visual Representations

    目录 概 主要内容 流程 projection head g constractive loss augmentation other 代码 Chen T., Kornblith S., Norouz ...

  7. ZH奶酪:【阅读笔记】Deep Learning, NLP, and Representations

    中文译文:深度学习.自然语言处理和表征方法 http://blog.jobbole.com/77709/ 英文原文:Deep Learning, NLP, and Representations ht ...

  8. 【RS】CoupledCF: Learning Explicit and Implicit User-item Couplings in Recommendation for Deep Collaborative Filtering-CoupledCF:在推荐系统深度协作过滤中学习显式和隐式的用户物品耦合

    [论文标题]CoupledCF: Learning Explicit and Implicit User-item Couplings in Recommendation for Deep Colla ...

  9. 【RS】List-wise learning to rank with matrix factorization for collaborative filtering - 结合列表启发排序和矩阵分解的协同过滤

    [论文标题]List-wise learning to rank with matrix factorization for collaborative filtering   (RecSys '10 ...

随机推荐

  1. Web服务并发I/O模型

    I/O模型: 阻塞型.非阻塞型.复用型.信号驱动型.异步 同步/异步: 关注消息通知机制 消息通知: 同步:等待对方返回消息 异步:被调用者通过状态.通知或回调机制通知调用者被调用者的运行状态 阻塞/ ...

  2. Windows和Mac浏览器启动本地程序

    前言 这几天有个需求,需要在IE上启动本地程序,就如下面一样. 一开始,我还以为IE有提供特殊的接口,类似上图中的“RunExe”,可以找了大半天觉得不对经(找不到该方法). 后来想想不对,这种方式是 ...

  3. WeakHashMap源码解读

    1. 简介 本文基于JDK8u111的源码分析WeakHashMap的一些主要方法的实现. 2. 数据结构 就数据结构来说WeakHashMap与HashMap原理差不多,都是拉链法来解决哈希冲突. ...

  4. node爬虫扒小说

    Step 1:  万年不变的初始化项目,安装依赖 cnpm i express cheerio superagent superagent-charset async -S express 就不用多说 ...

  5. UOJ 【UR #5】怎样跑得更快

    [UOJ#62]怎样跑得更快 题面 这个题让人有高斯消元的冲动,但肯定是不行的. 这个题算是莫比乌斯反演的一个非常巧妙的应用(不看题解不会做). 套路1: 因为\(b(i)\)能表达成一系列\(x(i ...

  6. centos7下安装docker(8.3容器的常用操作)

    yu我们之前已经学习了如何运行容器docker run,也学习了如何进入容器docker attach和docker exec,下面我们来学习容器的其他操作: stop/start/restart 1 ...

  7. ]remove-duplicates-from-sorted-list-ii (删除)

    题意略: 思路都在注解里: #include<iostream> #include<cstdio> using namespace std; struct ListNode { ...

  8. docker swarm英文文档学习-12-在集群模式中的Raft共识

    Raft consensus in swarm mode 在集群模式中的Raft共识 当Docker引擎在集群模式下运行时,manager节点实现Raft 共识算法来管理全局集群状态.Docker s ...

  9. jQuery 动画效果

    推荐网址:http://www.php100.com/manual/jquery/,用法教学,包括实例. 分类:显示隐藏.淡入淡出.滑动.自定义. <%@ Page Language=" ...

  10. ESP NVS

    简介:NVS的主要功能是:存储键值(存在flash上面): NVS利用spi_flash_{read|write|erase}这些API来操作数据在内存上的删改写,内存上data类型nvs子类型所代表 ...