【ML】ICML2015_Unsupervised Learning of Video Representations using LSTMs
Unsupervised Learning of Video Representations using LSTMs
Note: these are study notes on the LSTM architectures this paper proposes for unsupervised learning of video representations.
(For more on unsupervised representation learning, see also:
Learning Temporal Embeddings for Complex Video Analysis
Unsupervised Learning of Visual Representations using Videos
Unsupervised Visual Representation Learning by Context Prediction)
Link: http://arxiv.org/abs/1502.04681
Motivation:
- Understanding temporal sequences is important for solving many video-related problems. The temporal structure of videos can be used as a supervisory signal for unsupervised learning.
Proposed model:
In this paper, the authors propose three LSTM-based models:
1) LSTM Autoencoder Model:

This model is composed of two parts, the encoder and the decoder.
The encoder accepts a sequence of frames as input. After the last frame has been read, the encoder's state (the learned representation) is copied to the decoder as its initial state. The decoder then reconstructs the input frames in reverse order.
(This is the unconditional version. In the conditional version, the decoder also receives its last generated output as input, shown as the dashed boxes in the figure.)
Intuition: Reconstruction requires the network to capture information about the appearance of objects and the background, which is exactly the information we would like the representation to contain.
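The encoder-decoder flow described above can be sketched in plain Python. This is only an illustration of the data flow, not the paper's implementation: a toy tanh recurrent cell stands in for the LSTM cell, the weights are random and untrained, and all names (`rnn_step`, `encode`, `decode`, `readout`) are hypothetical. It shows the state being copied from encoder to decoder, the reversed reconstruction target, and the conditional/unconditional switch.

```python
import math
import random


def rnn_step(state, x, w_state, w_in):
    """One step of a toy tanh recurrent cell (an assumed stand-in for an LSTM cell)."""
    return [
        math.tanh(
            sum(w * s for w, s in zip(w_state[i], state))
            + sum(w * v for w, v in zip(w_in[i], x))
        )
        for i in range(len(state))
    ]


def readout(state, w_out):
    """Linear map from the hidden state to one output frame."""
    return [sum(w * s for w, s in zip(row, state)) for row in w_out]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def encode(frames, w_state, w_in, hidden):
    """Read all frames; the final state is the learned representation."""
    state = [0.0] * hidden
    for frame in frames:
        state = rnn_step(state, frame, w_state, w_in)
    return state


def decode(state, steps, frame_dim, w_state, w_in, w_out, conditional):
    """Start from the copied encoder state and emit `steps` frames."""
    outputs = []
    prev = [0.0] * frame_dim
    for _ in range(steps):
        # Conditional: feed back the last generated frame. Unconditional: no input.
        x = prev if conditional else [0.0] * frame_dim
        state = rnn_step(state, x, w_state, w_in)
        prev = readout(state, w_out)
        outputs.append(prev)
    return outputs


rng = random.Random(0)
frame_dim, hidden, T = 4, 3, 5
frames = [[rng.random() for _ in range(frame_dim)] for _ in range(T)]

enc_ws, enc_wi = rand_matrix(hidden, hidden, rng), rand_matrix(hidden, frame_dim, rng)
dec_ws, dec_wi = rand_matrix(hidden, hidden, rng), rand_matrix(hidden, frame_dim, rng)
w_out = rand_matrix(frame_dim, hidden, rng)

representation = encode(frames, enc_ws, enc_wi, hidden)
recon = decode(representation, T, frame_dim, dec_ws, dec_wi, w_out, conditional=False)
recon_cond = decode(representation, T, frame_dim, dec_ws, dec_wi, w_out, conditional=True)

# The decoder is trained to reproduce the input frames in reverse order.
targets = frames[::-1]
loss = sum((o - t) ** 2 for out, tgt in zip(recon, targets) for o, t in zip(out, tgt))
print("reconstruction loss (untrained):", loss)
```

Training would minimize `loss` with respect to all weights; that step is omitted here to keep the sketch short.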
2) LSTM Future Predictor Model:

This model is similar to the one above; the main difference lies in the output. Here the output is a prediction of the frames that come just after the input sequence. It also has conditional and unconditional versions, as described above.
Intuition: In order to predict the next few frames correctly, the model needs information about which objects are present and how they are moving so that the motion can be extrapolated.
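The only change from the autoencoder is what the decoder is asked to produce. A minimal sketch of how a clip might be split into encoder inputs and future targets follows; the function name `split_for_prediction` and the parameters `n_input` and `n_future` are hypothetical, and the string "frames" are placeholders for real image tensors.

```python
def split_for_prediction(frames, n_input, n_future):
    """Split one clip into encoder inputs and the future frames to predict."""
    if n_input + n_future > len(frames):
        raise ValueError("clip is too short for the requested split")
    inputs = frames[:n_input]
    # Targets are the frames immediately following the input, in forward order.
    targets = frames[n_input:n_input + n_future]
    return inputs, targets


clip = [f"frame_{t}" for t in range(8)]
inputs, targets = split_for_prediction(clip, n_input=5, n_future=3)
print(inputs)   # frames fed to the encoder
print(targets)  # frames the decoder should predict
```

Note that, unlike the reconstruction target, the prediction target keeps the natural temporal order.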
3) A Composite Model:

This model combines input reconstruction and future prediction to form a more powerful model. The two modules share the same encoder, which encodes the input sequence into a feature vector and copies it to both decoders.
Intuition: the single shared encoder learns a representation that contains not only the static appearance of objects and background, but also dynamic information such as moving objects and their motion patterns.
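One way to picture the composite objective is as two losses computed from the same encoded input. The sketch below makes two assumptions that these notes do not state: the per-frame loss is squared error, and the two terms are summed with equal weight. The helper names (`composite_targets`, `composite_loss`) are hypothetical, and each frame is a one-element list only to keep the arithmetic easy to follow.

```python
def squared_error(predicted, target):
    """Sum of squared differences over all frames and all pixels."""
    return sum(
        (p - t) ** 2
        for pf, tf in zip(predicted, target)
        for p, t in zip(pf, tf)
    )


def composite_targets(frames, n_input, n_future):
    """Return encoder inputs, the reversed reconstruction target, and the future target."""
    inputs = frames[:n_input]
    return inputs, inputs[::-1], frames[n_input:n_input + n_future]


def composite_loss(recon, recon_target, pred, pred_target):
    """Assumed objective: reconstruction loss plus prediction loss, equally weighted."""
    return squared_error(recon, recon_target) + squared_error(pred, pred_target)


frames = [[float(t)] for t in range(5)]
inputs, recon_target, future_target = composite_targets(frames, n_input=3, n_future=2)

# Stand-ins for the outputs of the two decoders (all zeros, i.e. untrained).
recon_out = [[0.0] for _ in recon_target]
pred_out = [[0.0] for _ in future_target]
print(composite_loss(recon_out, recon_target, pred_out, future_target))
```

Because both terms backpropagate into the one shared encoder, the representation is pushed to keep information useful for both tasks.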