Unsupervised Learning of Spatiotemporally Coherent Metrics

Note: this is a learning note on unsupervised learning from videos, covering a paper published by Yann LeCun's group.

Link: http://arxiv.org/pdf/1412.6056.pdf

Motivation:

Temporal coherence is a form of weak supervision, which they exploit to learn generic signal representations that are stable with respect to the variability in natural video, including local deformations.

This rests on the assumption that data samples that are temporal neighbors are also likely to be neighbors in the latent space.

(The invariant features in temporal sequences are also called slow features.)

Proposed Model:

The loss function based on temporal coherence is shown below:
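Written out in LaTeX, a loss matching the two-term description that follows takes roughly the form below. The notation is an assumption of this note: \(h_t = G_W(x_t)\) is the feature of frame \(x_t\) under an encoder with parameters \(W\), \(d\) is a distance in feature space, and \(m\) is the margin.

\[
L(x_t, x_{t'}, W) =
\begin{cases}
d(h_t, h_{t'}) & \text{if } |t - t'| = 1 \\
\max\bigl(0,\; m - d(h_t, h_{t'})\bigr) & \text{if } |t - t'| > 1
\end{cases}
\]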

The first term requires neighboring frames to have similar features, which maintains slowness. To keep the network from learning a constant mapping, they add a second term that forces frames at different time steps to be separated by at least m units in feature space.
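The two terms can be illustrated with a minimal NumPy sketch. The function name `coherence_loss`, the Euclidean distance, and the default margin are assumptions made for illustration, not the paper's code.

```python
import numpy as np

def coherence_loss(h_a, h_b, are_neighbors, margin=1.0):
    """Slowness loss for neighboring frames, margin hinge otherwise."""
    # Euclidean distance between the two feature vectors (assumed metric).
    d = float(np.linalg.norm(h_a - h_b))
    if are_neighbors:
        return d                    # pull temporal neighbors together
    return max(0.0, margin - d)     # push other pairs at least `margin` apart

# A constant mapping sends every frame to the same feature vector.
const = np.zeros(4)
print(coherence_loss(const, const, are_neighbors=True))   # slowness term is zero
print(coherence_loss(const, const, are_neighbors=False))  # margin term is not
```

The constant mapping minimizes the first term but pays the full margin on every non-neighboring pair, which is why the second term rules it out.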

However, the second term only provides a discriminative criterion on pairwise distances in the feature space. The paper argues that this discriminative constraint is too weak. Thus, they introduce a reconstruction term that not only prevents the constant solution but also explicitly preserves information about the input. So the new loss function is:
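In LaTeX, a loss consistent with this description can be written as below. The details are assumptions of this note rather than a quotation: a linear decoder \(W_d\), L2 pooling \(\|h\|^{P_i}\) over \(K\) groups of units \(P_i\), and weights \(\alpha\) and \(\beta\) on the sparsity and slowness terms.

\[
L(x_t, x_{t'}, W) = \sum_{\tau \in \{t, t'\}} \Bigl( \|W_d h_{\tau} - x_{\tau}\|^2 + \alpha |h_{\tau}| \Bigr)
+ \beta \sum_{i=1}^{K} \Bigl|\, \|h_t\|^{P_i} - \|h_{t'}\|^{P_i} \Bigr|
\]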

(The first term is the reconstruction term, \(\alpha|h_{\tau}|\) is the sparsity penalty, and the remaining term trains slow features.)
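As a rough sketch of how these three terms combine, the NumPy snippet below evaluates them for a pair of frames. Everything named here is hypothetical: the function names, the linear decoder `W_d`, the default weights `alpha` and `beta`, and L2 pooling over non-overlapping groups of `group_size` units are assumptions made for illustration.

```python
import numpy as np

def l2_pool(h, group_size):
    # L2 norm over non-overlapping groups of units (assumed pooling).
    return np.sqrt((h.reshape(-1, group_size) ** 2).sum(axis=1))

def autoencoder_slowness_loss(x_t, x_s, h_t, h_s, W_d,
                              alpha=0.1, beta=1.0, group_size=2):
    loss = 0.0
    for x, h in ((x_t, h_t), (x_s, h_s)):
        loss += np.sum((W_d @ h - x) ** 2)   # reconstruction term
        loss += alpha * np.sum(np.abs(h))    # sparsity penalty on the code
    # Slowness on pooled features: adjacent frames should pool to similar values.
    pooled_diff = l2_pool(h_t, group_size) - l2_pool(h_s, group_size)
    loss += beta * np.sum(np.abs(pooled_diff))
    return float(loss)

W_d = np.eye(4)                       # toy decoder: the code equals the input
a = np.array([1.0, 0.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 0.0, 0.0])    # different unit, same pooling group as a
c = np.array([0.0, 0.0, 1.0, 0.0])    # unit in a different pooling group
print(autoencoder_slowness_loss(a, b, a, b, W_d))
print(autoencoder_slowness_loss(a, c, a, c, W_d))
```

In this toy setup the pair (a, b) incurs no slowness penalty because both codes fall in the same pooling group, while (a, c) does, even though reconstruction is perfect in both cases.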

The overall pipeline is shown below:

Tricks:

The paper relies on several intuitions and tricks, but given my limited knowledge of this field, I will only go into one of them.

Pooling plays an important role in the architecture. Training through a local pooling operator enforces a local topology on the hidden activations, inducing units that are pooled together to learn complementary features.

Also, in a convolutional architecture, pooling both in space and across features can produce more invariant features.
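To make the invariance concrete, here is a small NumPy sketch of L2 pooling applied jointly over space and across feature maps. The function name `l2_pool_3d`, the (channels, height, width) layout, and the non-overlapping 2x2x2 pooling blocks are assumptions chosen for illustration.

```python
import numpy as np

def l2_pool_3d(fmap, feat=2, size=2):
    """L2-pool a (channels, height, width) array over non-overlapping
    blocks of `feat` channels by `size` x `size` pixels."""
    c, h, w = fmap.shape
    blocks = fmap.reshape(c // feat, feat, h // size, size, w // size, size)
    return np.sqrt((blocks ** 2).sum(axis=(1, 3, 5)))

# One active unit, then the same activation moved by one pixel
# and into the neighboring feature map.
frame = np.zeros((2, 4, 4))
frame[0, 0, 0] = 1.0
shifted = np.zeros((2, 4, 4))
shifted[1, 1, 1] = 1.0

print(np.array_equal(frame, shifted))                          # raw activations differ
print(np.array_equal(l2_pool_3d(frame), l2_pool_3d(shifted)))  # pooled outputs match
```

Because both activations land in the same pooling block, the pooled output is unchanged by the small shift, which is the kind of stability the slowness term rewards.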
