Unsupervised Visual Representation Learning by Context Prediction

Note: this is a learning note on an unsupervised learning model from Prof. Gupta's group.

Link: http://120.52.73.9/www.cv-foundation.org/openaccess/content_iccv_2015/papers/Doersch_Unsupervised_Visual_Representation_ICCV_2015_paper.pdf

Motivation:

- The motivation is the same as for most unsupervised learning methods, so it is omitted here.

Proposed Model:

- Given a central patch of an object and a second patch sampled from around it, the model must guess the relative spatial configuration of the two patches.

- Intuition: when humans do this task, we are more accurate once we recognize what the object is and what it looks like as a whole. That is to say, a model that plays this game well must have perceived the features of each object.

(i.e. we can get the right answer to the following quiz once we recognize what the objects are.)

So unsupervised representation learning can also be formulated as learning an embedding in which semantically similar images are close together, while semantically different ones are far apart.
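To make the task above concrete, here is a minimal sketch of how a (central patch, neighboring patch, label) training example could be drawn from one image. The function name `sample_patch_pair`, the label ordering and the default patch size are assumptions for illustration, not details taken from this note.

```python
import random

# Eight neighbor positions around the central patch, as (row, col) grid offsets.
# The index in this list is the class label (0..7); this ordering is an assumption.
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
                    (0, -1),           (0, 1),
                    (1, -1),  (1, 0),  (1, 1)]

def sample_patch_pair(img_h, img_w, patch=96, rng=random):
    """Return (center_box, neighbor_box, label); boxes are (top, left, size)."""
    if img_h < 3 * patch or img_w < 3 * patch:
        raise ValueError("image too small for a 3x3 grid of patches")
    # Pick the central patch so that all eight neighbors stay inside the image.
    top = rng.randint(patch, img_h - 2 * patch)
    left = rng.randint(patch, img_w - 2 * patch)
    # The label is free supervision: it comes from the sampling itself.
    label = rng.randrange(len(NEIGHBOR_OFFSETS))
    d_row, d_col = NEIGHBOR_OFFSETS[label]
    center = (top, left, patch)
    neighbor = (top + d_row * patch, left + d_col * patch, patch)
    return center, neighbor, label
```

No human annotation is involved: the label is just the index of the offset used to cut out the second patch, which is what makes the setup unsupervised.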

- Pipeline:

  • Feed the two patches into two parallel convolutional networks that share parameters.
  • Fuse the feature vectors of the two patches and pass the result through stacked fully connected layers.
  • Output an eight-dimensional vector that predicts the relative spatial configuration of the two patches, with one entry per possible neighbor position.
  • Compute the loss and gradients, and back-propagate through the network to update the weights.
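The forward half of the pipeline above can be sketched in plain Python. This is only a toy under stated assumptions: a single fully connected layer stands in for the convolutional tower, fusion is done by concatenation, the loss is softmax cross-entropy, and all sizes and names (`encode`, `predict`, `init_layer`) are hypothetical. The gradient step is left out; in practice an autodiff framework would handle it.

```python
import math
import random

def linear(weights, bias, x):
    """Fully connected layer: weights is a list of rows, x is a flat list."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (illustrative initialization)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def encode(patch, encoder):
    """Stand-in for the convolutional tower; both patches use the SAME parameters."""
    return relu(linear(encoder[0], encoder[1], patch))

def predict(patch_a, patch_b, encoder, fc1, fc2):
    """Return softmax probabilities over the eight relative positions."""
    fused = encode(patch_a, encoder) + encode(patch_b, encoder)  # list concatenation
    hidden = relu(linear(fc1[0], fc1[1], fused))
    logits = linear(fc2[0], fc2[1], hidden)
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    return -math.log(probs[label])

rng = random.Random(0)
patch_dim, feat_dim, hidden_dim = 48, 16, 32  # hypothetical toy sizes
encoder = init_layer(feat_dim, patch_dim, rng)
fc1 = init_layer(hidden_dim, 2 * feat_dim, rng)
fc2 = init_layer(8, hidden_dim, rng)

patch_a = [rng.random() for _ in range(patch_dim)]
patch_b = [rng.random() for _ in range(patch_dim)]
probs = predict(patch_a, patch_b, encoder, fc1, fc2)
loss = cross_entropy(probs, label=3)
```

The point to notice is that `encode` is called twice with the same `encoder` parameters, which is what "share parameters" means here. The two branches only interact after their features are fused.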

Avoiding “trivial” solutions:

We need to preprocess the images so that the model does not learn trivial features, such as:

- Low-level cues like boundary patterns or textures continuing between patches, which could potentially serve as a shortcut.

- Chromatic aberration: it arises from differences in the way the lens focuses light at different wavelengths. In some cameras, one color channel (commonly green) is shrunk toward the image center relative to the others. Once the network learns the absolute location on the lens, solving the relative location task becomes trivial.
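This note does not spell out the preprocessing itself. As far as I recall from the paper, the low-level cues are countered by leaving a gap between the patches and jittering their positions, and chromatic aberration is countered by, among other options, dropping color channels. The sketch below illustrates both under stated assumptions: the function names, the default values (96-pixel patches, a 48-pixel gap, up to 7 pixels of jitter), the zero noise mean and the noise scale are my reading or placeholders, not a reference implementation.

```python
import random
import statistics

def neighbor_top_left(center_top, center_left, offset,
                      patch=96, gap=48, jitter=7, rng=random):
    """Place a neighbor patch with a gap and random jitter.

    offset is a (row, col) pair with entries in {-1, 0, 1}.
    """
    step = patch + gap  # the gap removes shared edges between the two patches
    # Jitter keeps the spacing from being an exact, learnable constant.
    top = center_top + offset[0] * step + rng.randint(-jitter, jitter)
    left = center_left + offset[1] * step + rng.randint(-jitter, jitter)
    return top, left

def drop_colors(patch, rng=random):
    """Keep one random color channel; replace the other two with Gaussian noise.

    patch is a flat list of (r, g, b) tuples. The noise scale (1/100 of the
    kept channel's standard deviation) and the zero mean are assumptions.
    """
    keep = rng.randrange(3)
    kept_values = [pixel[keep] for pixel in patch]
    noise_std = statistics.pstdev(kept_values) / 100.0
    return [
        tuple(pixel[c] if c == keep else rng.gauss(0.0, noise_std) for c in range(3))
        for pixel in patch
    ]
```

With only one real color channel left, the network can no longer compare how the channels are displaced relative to each other, so it cannot read off the absolute position on the lens.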
