【CV】ICCV2015_Unsupervised Learning of Visual Representations using Videos
Unsupervised Learning of Visual Representations using Videos
Note here: it's a learning note on Prof. Gupta's novel work published on ICCV2015. It's really exciting to know how unsupervised learning method can contribute to learn visual representations! Also, Feifei-Li's group published a paper on video representation using unsupervised method in ICCV2015 almost at the same time! I also wrote a review on it, check it here!
Link: http://arxiv.org/pdf/1505.00687v2.pdf
Motivation:
- Supervised learning is popular for CNN to train an excellent model on various visual problems, while the application of unsupervised learning leaves blank.
- People learn concepts quickly without numerous instances for training, and we learning things in a dynamic, mostly unsupervised environment.
- We’re short of labeled video data to do supervised learning, but we can easily access to tons of unlabeled data through Internet, which can be made use of by unsupervised learning.
Proposed Model:
Target: learning visual representations from videos in an unsupervised way
Key idea: tracking of moving object provides supervision
Brief introduction:
- Objective function (constraint): capture the first patch p1 of a moving object, keep tracking of it and get another patch p2 after several frames, then randomly select a negative patch p- from other places. The idea of objective function constrains the distance of p1 and p2 in feature space should be shorter than distance of p1 and p-



- Selection of tracking patch: using IDT to obtain SURF interest points to find out which part of the frame moves most. Setting threshold on the ratio of SURF interest points to avoid noise and camera motion.
- Tracking: using KCF tracker to track the patch
- Overrall pipline:
Feed triplet into three identical CNN, put two fully-connected layers on the top of pooling-5 layer to project into feature space, then computing the ranking loss to back-propagate the network. (note that: these three CNN shares parameters)

Training strategy:
There’re many empirical details to train a more powerful CNN in this work, however I’m not going to dive into it, only give some brief reviews on some the trick.
- Choose of negative samples:
- Random selection in the first 10 epochs of training
- Hard negative mining in later epochs, we search for all the possible negative patches and choose the top K patches which give maximum loss
* Intuition on the result:

See from the table above, [unsup + fp(3 ensemble)] outperforms other methods on the detection task of bus, car, person and train, but falls far behind on detecting bird, cat, dog and sofa, which may give us some intuitions.
【CV】ICCV2015_Unsupervised Learning of Visual Representations using Videos的更多相关文章
- 【CV】ICCV2015_Unsupervised Learning of Spatiotemporally Coherent Metrics
Unsupervised Learning of Spatiotemporally Coherent Metrics Note here: it's a learning note on the to ...
- 【ML】ICML2015_Unsupervised Learning of Video Representations using LSTMs
Unsupervised Learning of Video Representations using LSTMs Note here: it's a learning notes on new L ...
- 【CV】ICCV2015_Unsupervised Visual Representation Learning by Context Prediction
Unsupervised Visual Representation Learning by Context Prediction Note here: it's a learning note on ...
- 【翻译】我钟爱的Visual Studio前端开发工具/扩展
原文:[翻译]我钟爱的Visual Studio前端开发工具/扩展 怎么样让Visual Studio更好地编写HTML5, CSS3, JavaScript, jQuery,换句话说就是如何更好地做 ...
- 论文解读(SimCLR)《A Simple Framework for Contrastive Learning of Visual Representations》
1 题目 <A Simple Framework for Contrastive Learning of Visual Representations> 作者: Ting Chen, Si ...
- A Simple Framework for Contrastive Learning of Visual Representations
目录 概 主要内容 流程 projection head g constractive loss augmentation other 代码 Chen T., Kornblith S., Norouz ...
- ZH奶酪:【阅读笔记】Deep Learning, NLP, and Representations
中文译文:深度学习.自然语言处理和表征方法 http://blog.jobbole.com/77709/ 英文原文:Deep Learning, NLP, and Representations ht ...
- 【RS】CoupledCF: Learning Explicit and Implicit User-item Couplings in Recommendation for Deep Collaborative Filtering-CoupledCF:在推荐系统深度协作过滤中学习显式和隐式的用户物品耦合
[论文标题]CoupledCF: Learning Explicit and Implicit User-item Couplings in Recommendation for Deep Colla ...
- 【RS】List-wise learning to rank with matrix factorization for collaborative filtering - 结合列表启发排序和矩阵分解的协同过滤
[论文标题]List-wise learning to rank with matrix factorization for collaborative filtering (RecSys '10 ...
随机推荐
- 数据可视化:CSV格式,JSON格式
下载CSV格式数据,进行可视化 csv.reader()创建一个与文件有关联的阅读器(reader)对象,reader处理文件中的第一行数据,并将每一项数据都存储在列表中 head_row = nex ...
- ccf--20150303--节日
本题思路:首先,计算a月1日是星期几,然后再通过b和c得出日期monday,最后判断monday是否合法. 题目与代码如下: 问题描述 试题编号: 201503-3 试题名称: 节日 时间限制: 1. ...
- Unity3d自制字体
这篇教学中会使用到BMFont 这个工具 准备好Unity5.3.2版本,其他版本会有异常 一.制作字体 下载链接: http://www.angelcode.com/products/bmfont/ ...
- 如何使用 eclipse进行断点 debug 程序
先给出一段程序,然后通过使用 eclipse 设置断点进行一步步操作看结果 package cn.debug.com; public class Demo18 { public static void ...
- 拓普微智能TFT液晶显示模块
关键词: 串口屏, 液晶屏, TFT,人机界面 概述: 智能模块(Smart LCD)是专为工业显示应用而设计的TFT液晶显示模块. 模块自带主控IC.Flash存储器.实时嵌入式操作系统,客户主机可 ...
- div放在li标签中,无法撑开li标签的问题
作为一个前端菜鸟,我又碰到问题了,今天把div放到li标签中,发现div并没有把li标签撑开,而是在li标签边界之外,具体情况如下图所示: 那么,怎样才能达到预期的效果(每个li中放置一个div标签, ...
- json.decoder.JSONDecodeError: Unexpected UTF-8 BOM (decode using utf-8-sig): line 1 column 1
问题描述:使用Python代码将txt城市列表文件转换为xls文件,源码如下, #!/usr/bin/env Python # coding=utf-8 import os import json i ...
- java通过传入的日期,获取所在周的周一至周日
public static void main(String[] args) { try { SimpleDateFormat sdf=new SimpleDateFormat("yyyy- ...
- Arduino IDE for ESP8266 ()esp8266项目 WIFI攻击器
https://www.wandianshenme.com/play/esp8266-nodemcu-create-portable-wifi-jammer/ 使用 ESP8266 制作 WiFi 干 ...
- Spring配置文件中的那些标签意味着什么(持续更新)
前言 在看这边博客时,如果遇到有什么不清楚的地方,可以参考我另外一边博文.Spring标签的探索,根据这边文章自己来深入源码一探究竟.这里自己只是简单记录一下各标签作用,每个人困惑不同,自然需求也不一 ...