Unsupervised Learning of Spatiotemporally Coherent Metrics

Note here: it's a learning note on the topic of unsupervised learning on videos, a novel work published by Yann LeCun's group.

Link: http://arxiv.org/pdf/1412.6056.pdf

Motivation:

Temporal coherence is a form of weak supervision, which they exploit to learn generic signal representations that are stable with respect to the variability in natural video, including local deformations.

This induces the assumption that data samples that are temporal neighbors are also likely to be neighbors in the latent space.

(The invariant features in temporal sequences are also called slow features.)

Proposed Model:

The loss function based on temporal coherence is shown below:

The first term denotes neighbor frames should be similar to maintain the slowness, but in case of the network learns a constant mapping, they add the second term to force frames at different time steps to be separated by at least a distance of m-units in feature space.

However, the second term only provides the discriminative criteria on pairwise distances in the feature space. This paper argues this discriminative constraint is too weak. Thus, they introduce a reconstruction term not only prevents the constant solution but also acts to explicitly preserve information about the input. So the new loss function is:

(The first term is reconstruction term, the second one is to train slow features. And \(a|h_{r}|\) denotes sparsity penalty term.)

The overall pipeline is shown below:

Tricks:

They leverage several intuitions and tricks in the paper, but as the limitation of knowledge in this field, I can just dive into one of these.

Pooling plays an important role in the architecture. Training through a local pooling operator enforces a local topology on the hidden activations, inducing units that are pooled together to learn complimentary features.

Also, pooling in space and across features when we use convolutional architecture can produce more invariant features.

【CV】ICCV2015_Unsupervised Learning of Spatiotemporally Coherent Metrics的更多相关文章

  1. 【CV】ICCV2015_Unsupervised Learning of Visual Representations using Videos

    Unsupervised Learning of Visual Representations using Videos Note here: it's a learning note on Prof ...

  2. 【CV】ICCV2015_Unsupervised Visual Representation Learning by Context Prediction

    Unsupervised Visual Representation Learning by Context Prediction Note here: it's a learning note on ...

  3. 【RS】CoupledCF: Learning Explicit and Implicit User-item Couplings in Recommendation for Deep Collaborative Filtering-CoupledCF:在推荐系统深度协作过滤中学习显式和隐式的用户物品耦合

    [论文标题]CoupledCF: Learning Explicit and Implicit User-item Couplings in Recommendation for Deep Colla ...

  4. 【RS】List-wise learning to rank with matrix factorization for collaborative filtering - 结合列表启发排序和矩阵分解的协同过滤

    [论文标题]List-wise learning to rank with matrix factorization for collaborative filtering   (RecSys '10 ...

  5. 【RS】Deep Learning based Recommender System: A Survey and New Perspectives - 基于深度学习的推荐系统:调查与新视角

    [论文标题]Deep Learning based Recommender System: A Survey and New Perspectives ( ACM Computing Surveys  ...

  6. 论文阅读笔记(三)【AAAI2017】:Learning Heterogeneous Dictionary Pair with Feature Projection Matrix for Pedestrian Video Retrieval via Single Query Image

    Introduction (1)IVPR问题: 根据一张图片从视频中识别出行人的方法称为 image to video person re-id(IVPR) 应用: ① 通过嫌犯照片,从视频中识别出嫌 ...

  7. 【转载】Deep Learning(深度学习)学习笔记整理

    http://blog.csdn.net/zouxy09/article/details/8775360 一.概述 Artificial Intelligence,也就是人工智能,就像长生不老和星际漫 ...

  8. 【转】Deep Learning(深度学习)学习笔记整理系列之(八)

    十.总结与展望 1)Deep learning总结 深度学习是关于自动学习要建模的数据的潜在(隐含)分布的多层(复杂)表达的算法.换句话来说,深度学习算法自动的提取分类需要的低层次或者高层次特征. 高 ...

  9. 【CV】ICCV2015_Describing Videos by Exploiting Temporal Structure

    Describing Videos by Exploiting Temporal Structure Note here: it's a learning note on the topic of v ...

随机推荐

  1. Linux运维之每日小技巧-检测网站状态以及PV、UV等介绍

    [root@ELK-chaofeng07 httpd]# curl -o /dev/null -w %{http_code}\\n -s www.baidu.com 状态码为200表示成功. PV.U ...

  2. MySQL安全模式:sql_safe_updates讲解

    什么是安全模式 在mysql中,如果在update和delete没有加上where条件,数据将会全部修改.不只是初识mysql的开发者会遇到这个问题,工作有一定经验的工程师难免也会忘记写入where条 ...

  3. ABAP 内表访问表达式的性能

    内表访问表达式是ABAP 7.4中引入的重要特性,可以使语句变得更加简洁.美观.那么它的读写性能怎么样呢?我进行了一点点测试. 读取 测试代码,使用三种方式读取同一内表,分别是read table关键 ...

  4. Redis的安装和Jedis的使用

    Redis的安装和学习资料 Redis的安装可以参考 https://www.cnblogs.com/dddyyy/p/9763098.html Redis的学习可以参考https://www.cnb ...

  5. python五十八课——正则表达式(分组)

    演示正则中的替换和切割操作:在这之前我们先学习一个分组的概念: 分组:在正则中定义(...)就可以进行分组,理解为得到了一个子组好处:1).如果正则中的逻辑比较复杂,使用分组就可以优化代码的阅读性(更 ...

  6. 移动端自适应rem布局

    补充一个基本知识,不许笑,1rem等于HTML中设置的字体大小(px) 首先,HTML 的 head 部分中加入如下代码: <meta name="viewport" con ...

  7. day10,11练习

    1.执行Python脚本两种方式? 略 2.简述位.字节关系? 8位一个字节 3.简述ASCII,Unicode,utf-8,gbk关系? ascii unicode utf8 4.请写出李杰分别用u ...

  8. 遇到的web请求错误码集合与解释

    302 临时移动.与301类似.但资源只是临时被移动.客户端应继续使用原有URI

  9. Linux中进程与线程的概念以及区别

    linux进程与线程的区别,早已成为IT界经常讨论但热度不减的话题.无论你是初级程序员,还是资深专家,都应该考虑过这个问题,只是层次角度不同罢了.对于一般的程序员,搞清楚二者的概念并在工作中学会运用是 ...

  10. SQL 清理缓存 更新无效

    --查询结果1 select * from Student where ID='CCB87B71-FB78-4BFE-8692-24DD2D8F8460' --查询结果2 where ID='CCB8 ...