Unsupervised Learning of Spatiotemporally Coherent Metrics

Note: this is a learning note on unsupervised learning from video, covering a paper from Yann LeCun's group.

Link: http://arxiv.org/pdf/1412.6056.pdf

Motivation:

Temporal coherence is a form of weak supervision. The authors exploit it to learn generic signal representations that are stable with respect to the variability in natural video, including local deformations.

This rests on the assumption that data samples that are temporal neighbors are also likely to be neighbors in the latent space.

(The invariant features in temporal sequences are also called slow features.)

Proposed Model:

The loss function based on temporal coherence is shown below:
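Written out in notation assumed here (\(G_W\) for the learned feature map with weights \(W\), \(x_t\) and \(x_{t+1}\) for neighboring frames, \(x_{t'}\) for a frame from a different time step, \(m\) for the margin), a loss of this kind takes roughly the following form. It is a sketch and may differ in detail from the paper's own equation:

\[
L(x_t, x_{t'}, W) = \sum_{t} \big\| G_W(x_t) - G_W(x_{t+1}) \big\| + \max\big(0,\; m - \big\| G_W(x_t) - G_W(x_{t'}) \big\| \big)
\]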

The first term says that neighboring frames should have similar features, which maintains slowness. To keep the network from simply learning a constant mapping, a second term forces frames at different time steps to be separated by at least a distance of m units in feature space.
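A minimal NumPy sketch of the two terms described above, for a single triplet of feature vectors. The function name, the argument names, and the use of the Euclidean norm are assumptions made for illustration, not the paper's code:

```python
import numpy as np

def temporal_coherence_loss(z_t, z_next, z_far, margin=1.0):
    """Slowness term plus a hinge (margin) term on feature vectors.

    z_t, z_next: features of two neighboring frames (hypothetical names).
    z_far: features of a frame from a different time step.
    """
    # Slowness: neighboring frames should be close in feature space.
    slow = np.linalg.norm(z_t - z_next)
    # Margin: frames from different time steps are pushed at least
    # `margin` apart; the hinge is zero once that distance is reached.
    push = max(0.0, margin - np.linalg.norm(z_t - z_far))
    return float(slow + push)

# A constant mapping makes the slowness term zero,
# but it pays the full margin in the second term.
const = np.zeros(4)
print(temporal_coherence_loss(const, const, const, margin=1.0))
```

The example at the end shows why the second term is needed: a mapping that sends every frame to the same point is perfectly slow, yet its loss equals the margin rather than zero.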

However, the second term only provides a discriminative criterion on pairwise distances in feature space. The paper argues that this constraint is too weak, so the authors introduce a reconstruction term that not only prevents the constant solution but also explicitly preserves information about the input. The new loss function is:
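As a sketch based on my reading of the paper, with notation assumed here (\(h_\tau\) for the hidden code of frame \(x_\tau\), \(W_d\) for the decoder weights, \(\alpha\) and \(\beta\) for weighting coefficients, and \(\|h\|^{P_i}\) for the L2 norm of the units in the \(i\)-th pooling group), the loss has roughly this shape and may differ in detail from the paper's equation:

\[
L = \sum_{\tau \in \{t,\, t+1\}} \Big( \big\| W_d h_\tau - x_\tau \big\|^2 + \alpha\,|h_\tau| \Big) + \beta \sum_{i} \Big|\, \|h_t\|^{P_i} - \|h_{t+1}\|^{P_i} \Big|
\]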

(The first term is the reconstruction term and the second trains slow features; \(\alpha|h_{\tau}|\) is the sparsity penalty.)
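The three ingredients (reconstruction, sparsity, slowness) can be sketched in NumPy as follows. Everything here is an assumption made for illustration: the ReLU encoder, the linear decoder, the names `W_e`, `W_d`, `alpha`, `beta`, and the choice to measure slowness on L2-pooled groups of hidden units:

```python
import numpy as np

def l2_pool(h, group_size):
    # L2 norm over non-overlapping groups of hidden units.
    return np.sqrt((h.reshape(-1, group_size) ** 2).sum(axis=1))

def autoencoder_slowness_loss(x_t, x_next, W_e, W_d,
                              alpha=0.1, beta=1.0, group_size=2):
    """Reconstruction + sparsity for each frame, plus a slowness
    penalty between the pooled codes of two neighboring frames."""
    loss = 0.0
    codes = []
    for x in (x_t, x_next):
        h = np.maximum(0.0, W_e @ x)            # hypothetical ReLU encoder
        loss += np.sum((W_d @ h - x) ** 2)      # reconstruction term
        loss += alpha * np.sum(np.abs(h))       # L1 sparsity penalty
        codes.append(h)
    # Slowness: pooled codes of neighboring frames should change little.
    pooled_t = l2_pool(codes[0], group_size)
    pooled_next = l2_pool(codes[1], group_size)
    loss += beta * np.sum(np.abs(pooled_t - pooled_next))
    return float(loss)

# Toy usage with identity encoder/decoder: the active unit moves from
# index 0 to index 1, which stays inside the same pooling group.
I = np.eye(4)
x_a = np.array([1.0, 0.0, 0.0, 0.0])
x_b = np.array([0.0, 1.0, 0.0, 0.0])
print(autoencoder_slowness_loss(x_a, x_b, I, I, alpha=0.0, beta=1.0))
```

In the toy usage the reconstruction is exact and the pooled codes are identical, so the loss is zero even though the raw codes differ; moving the active unit into another pooling group would incur a slowness cost.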

The overall pipeline is shown below:

Tricks:

The paper relies on several intuitions and tricks, but given my limited knowledge of this field, I can only go into one of them.

Pooling plays an important role in the architecture. Training through a local pooling operator enforces a local topology on the hidden activations, inducing units that are pooled together to learn complementary features.

In a convolutional architecture, pooling both in space and across features can also produce more invariant features.
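To make the invariance claim concrete, here is a small NumPy sketch of L2 pooling applied jointly over spatial windows and over groups of feature maps. The function name, the `(channels, height, width)` layout, and the non-overlapping windows are assumptions made for illustration; the paper's actual pooling configuration may differ:

```python
import numpy as np

def l2_pool_space_and_features(fmap, feat_group, window):
    """L2-pool a (channels, height, width) array over non-overlapping
    groups of `feat_group` channels and `window` x `window` patches.
    All three dimensions must be divisible by their pool size."""
    c, h, w = fmap.shape
    blocks = fmap.reshape(c // feat_group, feat_group,
                          h // window, window,
                          w // window, window)
    # Sum squares within each channel group and each spatial window.
    return np.sqrt((blocks ** 2).sum(axis=(1, 3, 5)))

# One activation, then the same activation shifted by one pixel and
# moved to the other feature map in the same pooling group.
a = np.zeros((2, 4, 4))
a[0, 0, 0] = 1.0
b = np.zeros((2, 4, 4))
b[1, 1, 1] = 1.0

pooled_a = l2_pool_space_and_features(a, feat_group=2, window=2)
pooled_b = l2_pool_space_and_features(b, feat_group=2, window=2)
print(np.array_equal(pooled_a, pooled_b))
```

The two inputs differ both in position and in which feature map fires, yet they produce the same pooled output, which is the sense in which pooling over space and features yields more invariant representations.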
