Unsupervised Learning of Spatiotemporally Coherent Metrics

Note: this is a learning note on unsupervised learning from videos, based on a paper from Yann LeCun's group published at ICCV 2015.

Link: http://arxiv.org/pdf/1412.6056.pdf

Motivation:

Temporal coherence is a form of weak supervision, which they exploit to learn generic signal representations that are stable with respect to the variability in natural video, including local deformations.

This rests on the assumption that data samples that are temporal neighbors are also likely to be neighbors in the latent space.

(The invariant features in temporal sequences are also called slow features.)

Proposed Model:

The loss function based on temporal coherence has two terms.

The first term requires neighboring frames to have similar features, which maintains slowness. On its own, this term could be satisfied by a network that learns a constant mapping, so they add a second term that forces frames at different time steps to be separated by at least a distance of m units in feature space.
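The two terms can be sketched in a few lines. This is a minimal illustration of my reading of the description above, not the authors' code: for a pair of features \(z_t, z_{t'}\) the loss is \(d(z_t, z_{t'})\) when the frames are neighbors and \(\max(0, m - d(z_t, z_{t'}))\) otherwise. The function name, the choice of Euclidean distance for \(d\), and the toy feature vectors are all assumptions made for the example.

```python
import numpy as np

def temporal_coherence_loss(z_a, z_b, are_neighbors, margin=1.0):
    """Contrastive slowness loss for one pair of frame features (sketch)."""
    # Assumption: Euclidean distance; the note does not say which distance is used.
    dist = np.linalg.norm(z_a - z_b)
    if are_neighbors:
        # First term: neighboring frames should be close in feature space.
        return dist
    # Second term: other frames should be at least `margin` apart.
    return max(0.0, margin - dist)

# Toy features (hypothetical values).
z_t = np.array([0.0, 0.0])
z_next = np.array([0.3, 0.4])  # a temporal neighbor, distance 0.5
z_far = np.array([3.0, 4.0])   # a distant frame, distance 5.0

neighbor_loss = temporal_coherence_loss(z_t, z_next, True)
far_loss = temporal_coherence_loss(z_t, z_far, False)
# A constant mapping sends every frame to the same point and pays the full margin.
collapsed_loss = temporal_coherence_loss(z_t, z_t, False)
```

The last call shows why the second term matters: a constant mapping makes the first term zero, but every non-neighboring pair is then penalized by the full margin.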

However, the second term only provides a discriminative criterion on pairwise distances in feature space, and the paper argues that this constraint is too weak. The authors therefore introduce a reconstruction term that not only prevents the constant solution but also explicitly preserves information about the input. The new loss function combines reconstruction, slowness, and sparsity terms.

(The first term is the reconstruction term and the second trains the slow features; \(\alpha|h_{\tau}|\) is the sparsity penalty.)
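As a rough sketch of how these three terms might fit together for one pair of neighboring frames, consider the code below. It is written under several assumptions that are mine, not the paper's: a ReLU encoder `W_e`, a linear decoder `W_d`, squared reconstruction error, L2 pooling over non-overlapping groups of hidden units, an L1 slowness penalty on the pooled features, and the weights `alpha` and `beta`. All names are hypothetical.

```python
import numpy as np

def l2_pool(h, group_size):
    """L2-pool non-overlapping groups of hidden units (assumed pooling)."""
    groups = h.reshape(-1, group_size)
    return np.sqrt((groups ** 2).sum(axis=1))

def pair_loss(x_a, x_b, W_e, W_d, alpha=0.1, beta=1.0, group_size=2):
    """Reconstruction + sparsity + slowness for one frame pair (sketch)."""
    total = 0.0
    codes = []
    for x in (x_a, x_b):
        h = np.maximum(0.0, W_e @ x)          # assumed ReLU encoder
        total += np.sum((W_d @ h - x) ** 2)   # reconstruction term
        total += alpha * np.abs(h).sum()      # sparsity penalty on the code
        codes.append(h)
    p_a, p_b = (l2_pool(h, group_size) for h in codes)
    total += beta * np.abs(p_a - p_b).sum()   # slowness on pooled features
    return total

# Toy setup: identity encoder/decoder, 4 hidden units pooled in 2 groups.
W = np.eye(4)
x_t = np.array([1.0, 0.0, 0.0, 0.0])
x_next = np.array([0.0, 1.0, 0.0, 0.0])  # activity moves within a pool group
x_far = np.array([0.0, 0.0, 1.0, 0.0])   # activity moves to another group

same_group_loss = pair_loss(x_t, x_next, W, W)
other_group_loss = pair_loss(x_t, x_far, W, W)
```

In this toy setup reconstruction is perfect, so `same_group_loss` consists only of the sparsity penalty: the active unit changes between frames, but the pooled feature does not. `other_group_loss` additionally pays the slowness penalty because the pooled features differ.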

[Figure: the overall pipeline of the model]

Tricks:

They leverage several intuitions and tricks in the paper, but given my limited knowledge of this field, I can only dive into one of them.

Pooling plays an important role in the architecture. Training through a local pooling operator enforces a local topology on the hidden activations, inducing units that are pooled together to learn complementary features.

Also, with a convolutional architecture, pooling both in space and across features can produce more invariant features.
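One way to picture "complementary features" is a pair of units whose responses are 90 degrees out of phase. The sketch below is my own illustration, not taken from the paper: the cosine and sine responses and the `phases` array are hypothetical stand-ins for two units reacting to a pattern that shifts from frame to frame. Each unit varies strongly over time, while their L2-pooled output stays constant.

```python
import numpy as np

# Hypothetical phase of a pattern as it shifts across 8 frames.
phases = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)

# Two complementary units: same pattern, responses 90 degrees apart.
unit_a = np.cos(phases)
unit_b = np.sin(phases)

# L2 pooling over the pair.
pooled = np.sqrt(unit_a ** 2 + unit_b ** 2)

unit_a_range = unit_a.max() - unit_a.min()  # large: a single unit is not slow
pooled_range = pooled.max() - pooled.min()  # close to zero: the pooled feature is slow
```

A slowness penalty applied after pooling would therefore favor this kind of pairing, which is consistent with the claim that pooled units are pushed toward complementary features.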
