Unsupervised Visual Representation Learning by Context Prediction

Note: this is a learning note on an unsupervised learning model from Prof. Gupta's group.

Link: http://120.52.73.9/www.cv-foundation.org/openaccess/content_iccv_2015/papers/Doersch_Unsupervised_Visual_Representation_ICCV_2015_paper.pdf

Motivation:

- The motivation is the same as for most unsupervised learning methods, so it is omitted here.

Proposed Model:

- Given one central patch of an object and a second patch taken from around it, the model must guess the relative spatial configuration of the two patches.

- Intuition: when humans do this task, we are more accurate once we recognize what the object is and what it looks like as a whole. In other words, a model that plays this game well must have learned to perceive the features of each object.

(i.e. we can get the right answer to such a quiz once we recognize what the objects are.)

So unsupervised representation learning can also be formulated as learning an embedding in which semantically similar images are close together, while semantically different ones are far apart.
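The pretext task above can be made concrete with a small sampling routine. This is only a minimal sketch: the function name `sample_patch_pair`, the ordering of the eight positions, and the plain 3x3 grid layout are assumptions made for illustration, not details taken from the paper.

```python
import random

# The eight neighbor positions around the central patch, as (row, col) grid
# offsets. The index of an offset in this list is the class label (0..7).
# This ordering is an arbitrary choice for illustration.
NEIGHBOR_OFFSETS = [
    (-1, -1), (-1, 0), (-1, 1),
    (0, -1),           (0, 1),
    (1, -1),  (1, 0),  (1, 1),
]


def sample_patch_pair(height, width, patch, rng=random):
    """Return (center_box, neighbor_box, label) for one training example.

    Boxes are (top, left, bottom, right) in pixel coordinates. The image
    must be at least 3 * patch pixels on each side.
    """
    if height < 3 * patch or width < 3 * patch:
        raise ValueError("image too small for a 3x3 grid of patches")
    # Choose the central patch so that all eight neighbors fit in the image.
    top = rng.randint(patch, height - 2 * patch)
    left = rng.randint(patch, width - 2 * patch)
    label = rng.randrange(len(NEIGHBOR_OFFSETS))
    d_row, d_col = NEIGHBOR_OFFSETS[label]
    n_top = top + d_row * patch
    n_left = left + d_col * patch
    center = (top, left, top + patch, left + patch)
    neighbor = (n_top, n_left, n_top + patch, n_left + patch)
    return center, neighbor, label
```

The label comes for free from how the pair was sampled, which is what makes the task unsupervised: no human annotation is needed.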

- Pipeline:

  • Feed the two patches into two parallel convolutional networks that share parameters.
  • Fuse the feature vectors of the two patches and pass the result through stacked fully connected layers.
  • Output an eight-dimensional vector that predicts the relative spatial configuration of the two patches (one entry per possible neighbor position).
  • Compute the loss and its gradients, then back-propagate through the network to update the weights.
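The pipeline steps can be sketched in NumPy as follows. This is a toy illustration under loud assumptions: a single shared linear layer stands in for the convolutional tower, the layer sizes and random data are made up, and only the last layer's gradient is written out by hand. A real implementation would use a convolutional network and back-propagate through every layer with an autodiff framework.

```python
import numpy as np

rng = np.random.default_rng(0)
PATCH_DIM = 3 * 8 * 8   # flattened 8x8 RGB patch (toy size, an assumption)
EMBED_DIM, HIDDEN_DIM, NUM_CLASSES = 16, 16, 8

# One set of "tower" weights, used for BOTH patches (parameter sharing).
W_shared = rng.normal(0.0, 0.05, (PATCH_DIM, EMBED_DIM))
W_fc = rng.normal(0.0, 0.1, (2 * EMBED_DIM, HIDDEN_DIM))
W_out = rng.normal(0.0, 0.1, (HIDDEN_DIM, NUM_CLASSES))


def relu(x):
    return np.maximum(x, 0.0)


def forward(patch_a, patch_b, w_out):
    """Return (hidden activations, class probabilities) for a batch of pairs."""
    feat_a = relu(patch_a @ W_shared)          # same weights ...
    feat_b = relu(patch_b @ W_shared)          # ... for both patches
    fused = np.concatenate([feat_a, feat_b], axis=1)
    hidden = relu(fused @ W_fc)
    logits = hidden @ w_out
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    return hidden, probs


def cross_entropy(probs, labels):
    """Mean negative log-likelihood of the true relative positions."""
    return -np.log(probs[np.arange(len(labels)), labels]).mean()


# Random stand-in data: 32 patch pairs with random labels in 0..7.
patch_a = rng.normal(size=(32, PATCH_DIM))
patch_b = rng.normal(size=(32, PATCH_DIM))
labels = rng.integers(0, NUM_CLASSES, size=32)

hidden, probs = forward(patch_a, patch_b, W_out)
loss_before = cross_entropy(probs, labels)

# Gradient of the mean cross-entropy w.r.t. the logits: (probs - one_hot) / N.
grad_logits = probs.copy()
grad_logits[np.arange(32), labels] -= 1.0
grad_logits /= 32
# One gradient-descent step on the last layer only (kept short on purpose).
W_out_new = W_out - 0.1 * (hidden.T @ grad_logits)

_, probs_after = forward(patch_a, patch_b, W_out_new)
loss_after = cross_entropy(probs_after, labels)
```

Treating the eight positions as classes turns the task into ordinary classification, so a standard softmax cross-entropy loss applies.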

Avoiding “trivial” solutions:

We need to preprocess the images so that the model does not learn trivial features such as:

- Low-level cues like boundary patterns or textures continuing between patches, which could potentially serve as a shortcut.

- Chromatic aberration: it arises from differences in the way the lens focuses light of different wavelengths. In some cameras, one color channel (commonly green) is shrunk toward the image center relative to the others, so the offset between channels reveals where in the image a patch came from. Once the network learns the absolute location on the lens, solving the relative location task becomes trivial.
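The note does not list the remedies. As far as I recall, the paper leaves a gap between the two patches, randomly jitters each patch location, and suppresses color information, for example by randomly dropping color channels. The sketch below illustrates these ideas only: the function names, parameters, and the `noise_scale` default are assumptions, not the authors' code.

```python
import numpy as np


def jittered_offset(direction, patch, gap, max_jitter, rng):
    """Pixel offset from the central patch to a neighbor patch.

    direction is a (d_row, d_col) pair with entries in {-1, 0, 1}. The gap
    keeps patch edges from lining up; the jitter makes the exact spacing
    unpredictable, so continuing lines or textures are a weaker cue.
    """
    jitter = rng.integers(-max_jitter, max_jitter + 1, size=2)
    step = patch + gap
    return (direction[0] * step + int(jitter[0]),
            direction[1] * step + int(jitter[1]))


def drop_color_channels(patch, rng, noise_scale=0.01):
    """Keep one random color channel; replace the other two with noise.

    patch is a float array of shape (H, W, 3). noise_scale (relative to the
    kept channel's standard deviation) is an illustrative value.
    """
    keep = int(rng.integers(0, 3))
    sigma = noise_scale * float(patch[..., keep].std())
    out = rng.normal(0.0, sigma, size=patch.shape)
    out[..., keep] = patch[..., keep]
    return out, keep
```

With only one real channel left, the network cannot compare channels against each other, which removes the chromatic-aberration shortcut.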
