ICLR2016_DELVING DEEPER INTO CONVOLUTIONAL NETWORKS

Note here: Ballas recently proposed a novel framework on learning video representation, following is the review note after reading his paper.

Link: http://arxiv.org/pdf/1511.06432v4.pdf

[Brief introduction to some neural networks]

CNN: excellent in static image classification

RNN: can understand temporal sequences in various learning tasks
(however, with exploding or vanishing weights problem)
---> LSTM/GRU are proposed to avoid this problem

RCN: leverage properties from both CNN and RNN, use CNN top level feature map as input of RNN, it has recently introduced to learn video representations.

[Video reprensentation]

Mmotivation:
Adopt RCN as basic model.
- Top-level feature map presents high sementic features, namely the spatial naunces are ignored after pooling.
- However, frame-to-frame temporal variation is known to be smooth, which is the key for action recognition from videos.
(we need a new model to adapt this problem)

[Proposed models]

GRU-RCN:
- replace recurrent units in RCN with GRU.

(z: activation gate, decides to what degree previous hidden state would contribute to the next hidden state)
(r: reset gate, decides whether or not last hidden state should be propagated into next state)
(~h: candidate hidden state, it'll pass through the activatin gate)
(h: final hidden state)

Problems:
- number of parameters in fully-connected layer is huge due to size of conv map.
- fully-connected layers break the spatial structure of conv map.

Trick:
- replace the fully-connected units in GRU with convolution operations, which can keep spatial structure and reduce number of parameters meanwhile.

Intuition:
- we can see the propagation of hidden states as a process of convolution.
if so, the next hidden state percepts spatial structure of all the previous states. as the sequence goes further, the receptive field on previous states are larger, and we only get a general concept of frames in the beginning.
- compare to our cognition system, it does make sense!

Stacked GRU-RCN:
- it applies L GRU-RCNs independently on each convolutional map.
- tile up L GRU-RCNs.
- feed L final time-step hidden states into a classifier.

【ML】ICLR2016_Delving Deeper into Convolutional Networks的更多相关文章

  1. 【ML】Two-Stream Convolutional Networks for Action Recognition in Videos

    Two-Stream Convolutional Networks for Action Recognition in Videos & Towards Good Practices for ...

  2. 【论文笔记】Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition

    Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition 2018-01-28  15:4 ...

  3. 【ML】Predict and Constrain: Modeling Cardinality in Deep Structured Prediction -预测和约束:在深度结构化预测中建模基数

    [论文标题]Predict and Constrain: Modeling Cardinality in Deep Structured Prediction   (35th-ICML,PMLR) [ ...

  4. 【网络结构可视化】Visualizing and Understanding Convolutional Networks(ZF-Net) 论文解析

    目录 0. 论文地址 1. 概述 2. 可视化结构 2.1 Unpooling 2.2 Rectification: 2.3 Filtering: 3. Feature Visualization 4 ...

  5. 【转载】 卷积神经网络(Convolutional Neural Network,CNN)

    作者:wuliytTaotao 出处:https://www.cnblogs.com/wuliytTaotao/ 本作品采用知识共享署名-非商业性使用-相同方式共享 4.0 国际许可协议进行许可,欢迎 ...

  6. 【翻译】给初学者的 Neural Networks / 神经网络 介绍

    本文翻译自 SATYA MALLICK 的  "Neural Networks : A 30,000 Feet View for Beginners" 原文链接: https:// ...

  7. 【ML】从特征分解,奇异值分解到主成分分析

    1.理解特征值,特征向量 一个对角阵\(A\),用它做变换时,自然坐标系的坐标轴不会发生旋转变化,而只会发生伸缩,且伸缩的比例就是\(A\)中对角线对应的数值大小. 对于普通矩阵\(A\)来说,是不是 ...

  8. 【ML】ICML2015_Unsupervised Learning of Video Representations using LSTMs

    Unsupervised Learning of Video Representations using LSTMs Note here: it's a learning notes on new L ...

  9. 【ML】人脸识别

    https://github.com/colipso/face_recognition https://medium.com/@ageitgey/machine-learning-is-fun-par ...

随机推荐

  1. nginx+gunicorn项目部署

    1.1安装虚拟环境 创建文件夹 mkdir data 目录文件夹 cd data 进入data文件夹 mkdir nginx 创建安装nginx的文件夹 mkdir server 存放代码的文件夹 m ...

  2. cpio的用法

    cpio 这个命令挺有趣的,因为 cpio 可以备份任何东西,包括装置设备文件.不过 cpio 有个大问题, 那就是 cpio 不会主动的去找文件来备份!啊!那怎办?所以罗,一般来说, cpio 得要 ...

  3. Deuteronomy

    You should choose the right path when you can choose, and you should choose the right path even if y ...

  4. Windows和Mac浏览器启动本地程序

    前言 这几天有个需求,需要在IE上启动本地程序,就如下面一样. 一开始,我还以为IE有提供特殊的接口,类似上图中的“RunExe”,可以找了大半天觉得不对经(找不到该方法). 后来想想不对,这种方式是 ...

  5. Java实现对zip和rar文件的解压缩

    通过java实现对zip和rar文件的解压缩

  6. C#Url处理类

    using System; using System.Text.RegularExpressions; using System.Web; using System.Collections.Speci ...

  7. Swift中的反射

    原文:http://www.cocoachina.com/applenews/devnews/2014/0623/8923.html Swift 事实上是支持反射的.只是功能略弱. 本文介绍主要的反射 ...

  8. P2430 严酷的训练 题解

    题目背景 Lj的朋友WKY是一名神奇的少年,在同龄人之中有着极高的地位... 题目描述 他的老师老王对他的程序水平赞叹不已,于是下决心培养这名小子. 老王的训练方式很奇怪,他会一口气让WKY做很多道题 ...

  9. postman本地测试post接口

    操作步骤 1.选择post请求 2.把url写到这里(一般测本地的话localhost:8080/项目名/xx/xx) 3.设置请求头 4.设置请求参数 5.发送请求 6.接收响应数据(json.ht ...

  10. 剑指offer.在O(1)时间内删除链表节点

    给定单向链表的一个节点指针,定义一个函数在O(1)时间删除该结点.假设链表一定存在,并且该节点一定不是尾节点. 样例 输入:链表 1->4->6->8 删掉节点:第2个节点即6(头节 ...