Unsupervised Learning of Visual Representations using Videos

Note here: it's a learning note on Prof. Gupta's novel work published on ICCV2015. It's really exciting to know how unsupervised learning method can contribute to learn visual representations! Also, Feifei-Li's group published a paper on video representation using unsupervised method in ICCV2015 almost at the same time! I also wrote a review on it, check it here!

Link: http://arxiv.org/pdf/1505.00687v2.pdf

Motivation:

- Supervised learning is popular for CNN to train an excellent model on various visual problems, while the application of unsupervised learning leaves blank.

- People learn concepts quickly without numerous instances for training, and we learning things in a dynamic, mostly unsupervised environment.

- We’re short of labeled video data to do supervised learning, but we can easily access to tons of unlabeled data through Internet, which can be made use of by unsupervised learning.

Proposed Model:

Target: learning visual representations from videos in an unsupervised way

Key idea: tracking of moving object provides supervision

Brief introduction:

- Objective function (constraint): capture the first patch p1 of a moving object, keep tracking of it and get another patch p2 after several frames, then randomly select a negative patch p- from other places. The idea of objective function constrains the distance of p1 and p2 in feature space should be shorter than distance of p1 and p-

- Selection of tracking patch: using IDT to obtain SURF interest points to find out which part of the frame moves most. Setting threshold on the ratio of SURF interest points to avoid noise and camera motion.

- Tracking: using KCF tracker to track the patch

- Overrall pipline:

Feed triplet into three identical CNN, put two fully-connected layers on the top of pooling-5 layer to project into feature space, then computing the ranking loss to back-propagate the network. (note that: these three CNN shares parameters)

Training strategy:

There’re many empirical details to train a more powerful CNN in this work, however I’m not going to dive into it, only give some brief reviews on some the trick.

- Choose of negative samples:

- Random selection in the first 10 epochs of training

- Hard negative mining in later epochs, we search for all the possible negative patches and choose the top K patches which give maximum loss

* Intuition on the result:

See from the table above, [unsup + fp(3 ensemble)] outperforms other methods on the detection task of bus, car, person and train, but falls far behind on detecting bird, cat, dog and sofa, which may give us some intuitions.

【CV】ICCV2015_Unsupervised Learning of Visual Representations using Videos的更多相关文章

  1. 【CV】ICCV2015_Unsupervised Learning of Spatiotemporally Coherent Metrics

    Unsupervised Learning of Spatiotemporally Coherent Metrics Note here: it's a learning note on the to ...

  2. 【ML】ICML2015_Unsupervised Learning of Video Representations using LSTMs

    Unsupervised Learning of Video Representations using LSTMs Note here: it's a learning notes on new L ...

  3. 【CV】ICCV2015_Unsupervised Visual Representation Learning by Context Prediction

    Unsupervised Visual Representation Learning by Context Prediction Note here: it's a learning note on ...

  4. 【翻译】我钟爱的Visual Studio前端开发工具/扩展

    原文:[翻译]我钟爱的Visual Studio前端开发工具/扩展 怎么样让Visual Studio更好地编写HTML5, CSS3, JavaScript, jQuery,换句话说就是如何更好地做 ...

  5. 论文解读(SimCLR)《A Simple Framework for Contrastive Learning of Visual Representations》

    1 题目 <A Simple Framework for Contrastive Learning of Visual Representations> 作者: Ting Chen, Si ...

  6. A Simple Framework for Contrastive Learning of Visual Representations

    目录 概 主要内容 流程 projection head g constractive loss augmentation other 代码 Chen T., Kornblith S., Norouz ...

  7. ZH奶酪:【阅读笔记】Deep Learning, NLP, and Representations

    中文译文:深度学习.自然语言处理和表征方法 http://blog.jobbole.com/77709/ 英文原文:Deep Learning, NLP, and Representations ht ...

  8. 【RS】CoupledCF: Learning Explicit and Implicit User-item Couplings in Recommendation for Deep Collaborative Filtering-CoupledCF:在推荐系统深度协作过滤中学习显式和隐式的用户物品耦合

    [论文标题]CoupledCF: Learning Explicit and Implicit User-item Couplings in Recommendation for Deep Colla ...

  9. 【RS】List-wise learning to rank with matrix factorization for collaborative filtering - 结合列表启发排序和矩阵分解的协同过滤

    [论文标题]List-wise learning to rank with matrix factorization for collaborative filtering   (RecSys '10 ...

随机推荐

  1. 【PAT】B1079 延迟的回文数(20 分)

    用了柳婼大佬博客的思路,但实现有不同 没有用string所以要考虑字符串末尾的'\0' 用的stl中的reverse逆置字符串 #include<stdio.h> #include< ...

  2. LNMP环境搭建详细教程

    之前有一篇博客写的是LAMP的环境搭建,今天来详细介绍一下另外一个模式——LNMP=Linux+Nginx+MySQL+PHP. 一.在Linux系统下nginx的安装过程,先到http://ngin ...

  3. vue框架简介

    MVVM框架概述 什么是vue 是一套构建用户界面的渐进式(用到哪一块就用哪一块,不需要全部用上)前端框架,Vue 的核心库只关注视图层 vue的兼容性 Vue.js 不支持 IE8 及其以下版本,因 ...

  4. [笔记]一些STL用法

    参考资料:STL 在 OI 中的应用 离散化 std::unique 功能:对有序的容器重新排列,将第一次出现的元素从前往后排,其他重复出现的元素依次排在后面 返回值:返回迭代器,迭代器指向的是重复元 ...

  5. PHP 用 fsockopen()、fputs() 来请求一个 URL,不要求返回

    项目需要,场景如下: 某个条件下需要调用接口发送多个请求执行脚本,但是由于每个请求下的脚本执行时间在半个小时左右,所以 就放弃返回执行结果,只要求能秒发送所以就可以. 代码如下: /** * 发起异步 ...

  6. centos 切换nginx跟apache环境

    启动nginx的启动 nginx -c /etc/nginx/nginx.conf 停止nginx的方法切换到apache. pkill -9 nginx 直接杀死运行中的程序,关闭nginx ser ...

  7. 快速对Mysql添加索引的五个方法

    1.添加PRIMARY KEY(主键索引) mysql>ALTER TABLE `table_name` ADD PRIMARY KEY ( `column` ) 2.添加UNIQUE(唯一索引 ...

  8. hass连接设备

    hass如何自动发现mqtt设备 https://www.hachina.io/docs/7230.html 玩家自定义 https://www.hachina.io/6572.html

  9. Ubuntu 14.04服务器配置 (1) 安装和配置

    http://jingyan.baidu.com/article/9c69d48fb9fd7b13c8024e6b.html ssh是一种安全协议,主要用于给远程登录会话数据进行加密,保证数据传输的安 ...

  10. Java的快速失败和安全失败

    文章转自https://www.cnblogs.com/ygj0930/p/6543350.html 一:快速失败(fail—fast) 在用迭代器遍历一个集合对象时,如果遍历过程中对集合对象的内容进 ...