【CV】ICCV2015_Unsupervised Visual Representation Learning by Context Prediction
Unsupervised Visual Representation Learning by Context Prediction
Note here: it's a learning note on unsupervised learning model from Prof. Gupta's group.
Motivation:
- Similar to most motivations of unsupervised learning method, cut it out here.
Proposed Model:
- Given one central patch of the object, and another one arounding it, the model must guess the relative spatial configuration between these two patches.
- Intuition: when human doing this assignment, we get higher accuracy once we recognize what object it is and what it’s like with a whole look. That is to say, a model plays well on this game would have percepted the features of each object.
(i.e. we can get right answer for the following quizz once we recognize what objects they are.)


So the unsupervised representation learning can also be formulated as learning an embedding where images that are semantically similar close, while semantically different ones are far apart.
- Pipline:
- Feed two patches into a parallel convolutional network which share parameters.
- Fuse the feature vector of each patch and pass through stacked fully connected layers.
- Come out with an eight-dimension vector that predicts relative spatial configuration between the two patches.
- Compute loss, gradients and back propagate through this network to update weights.

Aoiding “trivial” solutions:
We need to preprocess images to avoid the model learns some trivial features, like:
- Low-level cues like boundary patterns or textures continuing between patches, which could potentially serve as a shortcut.
- Chromatic aberration: it arises from differences in the way the lens focuses light and different wavelengths. In some cameras, one color channel (commonly green) is shrunk toward the image center relative to the others. Once the network learns the absolute location on the lens, solving the relatve location task becomes trivial.
【CV】ICCV2015_Unsupervised Visual Representation Learning by Context Prediction的更多相关文章
- 【CV】ICCV2015_Unsupervised Learning of Visual Representations using Videos
Unsupervised Learning of Visual Representations using Videos Note here: it's a learning note on Prof ...
- 【CV】ICCV2015_Unsupervised Learning of Spatiotemporally Coherent Metrics
Unsupervised Learning of Spatiotemporally Coherent Metrics Note here: it's a learning note on the to ...
- 论文解读《Momentum Contrast for Unsupervised Visual Representation Learning》俗称 MoCo
论文题目:<Momentum Contrast for Unsupervised Visual Representation Learning> 论文作者: Kaiming He.Haoq ...
- Microsoft Azure Web Sites应用与实践【3】—— 通过Visual Studio Online在线编辑Microsoft Azure 网站
Microsoft Azure Web Sites应用与实践 系列: [1]—— 打造你的第一个Microsoft Azure Website [2]—— 通过本地IIS 远程管理Microsoft ...
- Momentum Contrast for Unsupervised Visual Representation Learning (MoCo)
Momentum Contrast for Unsupervised Visual Representation Learning 一.Methods Previously Proposed 1. E ...
- Momentum Contrast for Unsupervised Visual Representation Learning
Momentum Contrast for Unsupervised Visual Representation Learning 一.Methods Previously Proposed 1. E ...
- 论文阅读(Xiang Bai——【arXiv2016】Scene Text Detection via Holistic, Multi-Channel Prediction)
Xiang Bai--[arXiv2016]Scene Text Detection via Holistic, Multi-Channel Prediction 目录 作者和相关链接 方法概括 创新 ...
- 【VBS】使用Visual Studio调试VBS程序
首先要确保机器上安装了Visual Stuido, 然后打开命令行窗口执行如下命令,会弹出是否使用Visual Studio进行调试的确认窗口. 点[是]进行调试. WScript.exe [vbs文 ...
- 论文阅读笔记(五)【CVPR2012】:Large Scale Metric Learning from Equivalence Constraints
由于在读文献期间多次遇见KISSME,都引自这篇CVPR,所以详细学习一下. Introduction 度量学习在机器学习领域有很大作用,其中一类是马氏度量学习(Mahalanobis metric ...
随机推荐
- 探索哪个进程使磁盘I/O升高
如果生产环境中磁盘使用率突然升高,却不知道因为哪个应用程序导致的,这个时候我们可以使用pidstat命令来查看,比如 Linux .el7.x86_64 (ip.ec2.internal) _x86_ ...
- 【转】handbrake使用教程
原文地址http://tieba.baidu.com/p/2399590151?pn=1 现在的很多压制教程基本都是使用megui或者mediacoder的,这两个软件使 ...
- 【微信JSSDK】PHP版微信录音文件下载
微信的录音文件上传到微信服务器上,只能保存三天. 因此需要做一个转存到自己服务器,或者七牛云的操作. 转存到自己服务器 调用微信JSSDK API 录音, 录音结束,上传到微信服务器,获取录音文件的 ...
- php数据库单例模式理解
单例模式(职责模式): 简单的说,一个对象(在学习设计模式之前,需要比较了解面向对象思想)只负责一个特定的任务: 单例类: 1.构造函数需要标记为private(访问控制:防止外部代码使用new操作符 ...
- ES5-ES6-ES7_let关键字声明变量
let命令的介绍 let是ECMAScript6中新增的关键字,用于声明变量.它的用法类似于var var a = 3 let b = 4 let变量的声明 let 命令的特点不允许在同一作用域下声明 ...
- (8)Python判断结构
- Swift中的反射
原文:http://www.cocoachina.com/applenews/devnews/2014/0623/8923.html Swift 事实上是支持反射的.只是功能略弱. 本文介绍主要的反射 ...
- Http接口安全整理
1.Http接口安全概述: 1.1.Http接口是互联网各系统之间对接的重要方式之一,使用http接口,开发和调用都很方便,也是被大量采用的方式,它可以让不同系统之间实现数据的交换和共享,但由于htt ...
- HTTPS协议,SSL协议及完整交互过程
文章转自 https://blog.csdn.net/dfsaggsd/article/details/50910999 SSL 1. 安全套接字(Secure Socket Layer ...
- 多线程操作的方法(sleep,)setPriority(Thread.MIN_PRIORITY);yield();
在多线程中所有的操作方法都是从Thread类开始的,所有的操作基本都在Thread类中. 第一取得线程名字 a,在Thread类中,可以通过getName()方法,获得线程的名字,可以通过setNam ...