【ML】ICLR2016_Delving Deeper into Convolutional Networks

Note: Ballas et al. recently proposed a novel framework for learning video representations. The following are my review notes on the paper.

Link: http://arxiv.org/pdf/1511.06432v4.pdf

[Brief introduction to some neural networks]

CNN: excellent at static image classification

RNN: can model temporal sequences in various learning tasks
(however, it suffers from the exploding or vanishing gradient problem)
---> LSTM/GRU were proposed to mitigate this problem

RCN (recurrent convolutional network): leverages properties of both CNN and RNN by using the CNN's top-level feature map as the input of the RNN. It has recently been introduced to learn video representations.

[Video representation]

Motivation:
Adopt RCN as the basic model.
- The top-level feature map carries high-level semantic features, which means the spatial nuances are lost after pooling.
- However, frame-to-frame temporal variation is known to be smooth, so the motion between frames is mostly local, and this local spatial detail is key for action recognition in videos.
(so we need a new model that addresses this problem)

[Proposed models]

GRU-RCN:
- replace the recurrent units in RCN with GRUs.

(z: update gate, decides to what degree the previous hidden state is carried over to the next hidden state)
(r: reset gate, decides how much of the previous hidden state is used when computing the candidate state)
(~h: candidate hidden state, which is blended with the previous hidden state through the update gate)
(h: final hidden state)
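
For reference, the four quantities above can be written out with the standard GRU update. This is a sketch of the usual textbook formulation, not copied from the paper: the weight names W, U (with subscripts z and r) are assumed notation, bias terms are omitted, and some references swap the roles of z and 1 - z in the last line.

```latex
\begin{aligned}
z_t &= \sigma\bigl(W_z x_t + U_z h_{t-1}\bigr) \\
r_t &= \sigma\bigl(W_r x_t + U_r h_{t-1}\bigr) \\
\tilde{h}_t &= \tanh\bigl(W x_t + U (r_t \odot h_{t-1})\bigr) \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
\end{aligned}
```

Here x_t is the input at time step t, \sigma is the logistic sigmoid, and \odot is element-wise multiplication.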

Problems:
- the number of parameters in the fully-connected layers is huge because the convolutional maps are large.
- fully-connected layers break the spatial structure of the convolutional maps.
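
To get a feeling for the first problem, here is a rough count of the weights in a single fully-connected input-to-hidden matrix. The map size and channel counts are hypothetical values picked for illustration, and the sketch assumes the hidden state has the same spatial size as the input map.

```python
def fc_gate_params(height, width, in_channels, hidden_channels):
    """Weights in ONE fully-connected matrix that maps a flattened
    (in_channels, height, width) map to a flattened
    (hidden_channels, height, width) hidden state. Biases are ignored."""
    n_in = height * width * in_channels
    n_out = height * width * hidden_channels
    return n_in * n_out


# Hypothetical sizes, not taken from the paper.
HEIGHT, WIDTH, C_IN, C_HID = 14, 14, 512, 256

fc_params = fc_gate_params(HEIGHT, WIDTH, C_IN, C_HID)
print(f"fully-connected input-to-hidden weights: {fc_params:,}")
```

A GRU needs three such input-to-hidden matrices plus three hidden-to-hidden ones, so the real total is several times larger.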

Trick:
- replace the fully-connected units in the GRU with convolution operations, which preserves the spatial structure and reduces the number of parameters at the same time.
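
A minimal NumPy sketch of one such convolutional GRU step is given below. It is an illustration under assumptions, not the authors' implementation: the helper names (conv2d_same, conv_gru_step, init_params), the parameter names, the tiny tensor sizes, and the omission of biases are all choices made here.

```python
import numpy as np


def conv2d_same(x, w):
    """Stride-1, zero-padded 2-D cross-correlation.

    x: (C_in, H, W); w: (C_out, C_in, k, k) with odd k. Returns (C_out, H, W).
    """
    c_out, _, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    _, h, wd = x.shape
    out = np.zeros((c_out, h, wd))
    for i in range(k):
        for j in range(k):
            # Contribution of kernel tap (i, j) to every output position.
            patch = xp[:, i:i + h, j:j + wd]
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], patch)
    return out


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def conv_gru_step(x_t, h_prev, p):
    """One GRU step where every matrix product is replaced by a convolution."""
    z = sigmoid(conv2d_same(x_t, p["Wz"]) + conv2d_same(h_prev, p["Uz"]))
    r = sigmoid(conv2d_same(x_t, p["Wr"]) + conv2d_same(h_prev, p["Ur"]))
    h_tilde = np.tanh(conv2d_same(x_t, p["W"]) + conv2d_same(r * h_prev, p["U"]))
    return (1.0 - z) * h_prev + z * h_tilde


def init_params(c_in, c_hid, k, rng):
    """Six small random kernels: three input-to-hidden, three hidden-to-hidden."""
    in_ch = {"Wz": c_in, "Wr": c_in, "W": c_in, "Uz": c_hid, "Ur": c_hid, "U": c_hid}
    return {name: 0.1 * rng.standard_normal((c_hid, c, k, k)) for name, c in in_ch.items()}


# Hypothetical toy sizes so the loop runs instantly.
rng = np.random.default_rng(0)
C_IN, C_HID, HEIGHT, WIDTH, K, T = 4, 3, 6, 6, 3, 5
params = init_params(C_IN, C_HID, K, rng)
frames = rng.standard_normal((T, C_IN, HEIGHT, WIDTH))  # T feature maps, one per frame

h = np.zeros((C_HID, HEIGHT, WIDTH))
for x_t in frames:
    h = conv_gru_step(x_t, h, params)

n_params = sum(w.size for w in params.values())
print(h.shape, n_params)
```

The hidden state keeps its (channels, height, width) layout throughout, and the number of weights depends only on the kernel size and channel counts, not on the height and width of the map.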

Intuition:
- we can see the propagation of hidden states as a process of convolution.
If so, each hidden state perceives the spatial structure of all the previous states. As the sequence goes on, the receptive field over the earlier states grows larger, so only a general impression of the earliest frames is retained.
- compared with our own cognitive system, this does make sense!
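
The growth of the receptive field can be made concrete with a little arithmetic. The sketch below assumes stride-1 hidden-to-hidden convolutions with a square kernel; the function name and the kernel size of 3 are illustrative choices.

```python
import numpy as np


def receptive_field(kernel_size, steps_back):
    """Side length of the region of the hidden state `steps_back` steps ago
    that can influence a single position of the current hidden state,
    assuming stride-1 hidden-to-hidden convolutions."""
    return 1 + steps_back * (kernel_size - 1)


for steps in (1, 2, 5, 10):
    print(f"{steps:2d} steps back -> {receptive_field(3, steps)} pixels wide")

# Sanity check in 1-D: convolving an impulse four times with a width-3
# kernel spreads it over exactly receptive_field(3, 4) positions.
support = np.array([1.0])
for _ in range(4):
    support = np.convolve(support, np.ones(3))
assert support.size == receptive_field(3, 4)
```

Each step back widens the region by kernel_size - 1, so distant frames are seen through an ever larger window and contribute only a coarse summary.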

Stacked GRU-RCN:
- it applies L GRU-RCNs, one on each of L convolutional maps.
- the L GRU-RCNs are stacked on top of each other.
- the L final time-step hidden states are fed into a classifier.
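
The last step, feeding the L final hidden states into a classifier, could look roughly like the sketch below. The notes do not say how the states are combined, so the spatial average pooling, the concatenation, the single linear softmax layer, and all shapes here are assumptions made for illustration only.

```python
import numpy as np


def softmax(v):
    e = np.exp(v - v.max())  # subtract the max for numerical stability
    return e / e.sum()


def classify_from_final_states(final_states, weight, bias):
    """final_states: list of L arrays, each of shape (C_l, H_l, W_l)."""
    # Average each final hidden state over its spatial dimensions (assumed choice).
    pooled = [h.mean(axis=(1, 2)) for h in final_states]
    features = np.concatenate(pooled)  # one vector of length sum(C_l)
    return softmax(weight @ features + bias)


# Hypothetical final hidden states for L = 3 layers of different resolution.
rng = np.random.default_rng(0)
shapes = [(8, 28, 28), (16, 14, 14), (32, 7, 7)]
final_states = [rng.standard_normal(s) for s in shapes]

n_classes = 5
weight = 0.01 * rng.standard_normal((n_classes, sum(s[0] for s in shapes)))
bias = np.zeros(n_classes)

probs = classify_from_final_states(final_states, weight, bias)
print(probs.shape, probs.sum())
```

Pooling first makes the maps from different layers comparable even though their spatial sizes differ.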
