Paper Read: Robust Deep Multi-modal Learning Based on Gated Information Fusion Network

Robust Deep Multi-modal Learning Based on Gated Information Fusion Network

2018-07-27 14:25:26

Paper：https://arxiv.org/pdf/1807.06233.pdf

Related Papers:

1. Infrared and visible image fusion methods and applications: A survey 　　Paper

2. Chenglong Li, Xiao Wang, Lei Zhang, Jin Tang, Hejun Wu, and Liang Lin. WELD: Weighted Low-rank Decomposition or Robust Grayscale-Thermal Foreground Detection. IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT), 27(4): 725-738, 2017. [Project pagewith Dataset and Code]

3. Chenglong Li, Xinyan Liang, Yijuan Lu, Nan Zhao, and Jin Tang. RGB-T Object Tracking: Benchmark and Baseline.[arXiv] [Dataset: Google drive, Baidu cloud] [Project page]

本文针对多模态融合问题（Multi-modal），提出一种基于 gate 机制的融合策略，能够自适应的进行多模态信息的融合。作者将该方法用到了物体检测上，其大致流程图如下所示：

如上图所示，作者分别用两路 Network 来提取两个模态的特征。该网络是由标准的 VGG-16 和 8 extra convolutional layers 构成。另外，作者提出新的 GIF（Gated Information Fusion Network）网络进行多个模态之间信息的融合，以取得更好的结果。动机当然就是多个模态的信息，是互补的，但是有的信息帮助会更大，有的可能就质量比较差，功效比较小，于是就可以自适应的来融合，达到更好的效果。

Gated Information Fusion Network (GIF)：

如上图所示：

该 GIF 网络的输入是：已经提取的 CNN feature map，这里是 F1, F2. 然后，将这两个 feature 进行 concatenate，得到 $F_G$. 该网络包含两个部分：

1. information fusion network（图2，虚线框意外的部分）；

2. weight generation network （WG Network，即：图2，虚线处）；

Weight Generation Network 分别用两个 3*3*1 的卷积核对组合后的 feature map $F_G$ 进行操作，然后输入到 sigmoid 函数中，即：gate layer，然后输出对应的权重 $w_1$，$w_2$。

Information fusion network 分别用得到的两个权重，点乘原始的 feature map，得到加权以后的特征图，将两者进行 concatenate 后，用 1*1*2k 的卷积核，得到最终的 feature map。

总结整个过程，可以归纳为：

== Done !

Paper Read: Robust Deep Multi-modal Learning Based on Gated Information Fusion Network的更多相关文章

Exploring Architectural Ingredients of Adversarially Robust Deep Neural Networks
目录概主要内容深度宽度代码 Huang H., Wang Y., Erfani S., Gu Q., Bailey J. and Ma X. Exploring architectural ...
【论文简读】 Deep web data extraction based on visual
<Deep web data extraction based on visual information processing>作者 J Liu 上海海事大学 2017 AIHC会议登载 ...
Paper List ABOUT Deep Learning
Deep Learning 方向的部分 Paper ,自用.一 RNN 1 Recurrent neural network based language model RNN用在语言模型上的开山之作 ...
【RS】Deep Learning based Recommender System: A Survey and New Perspectives - 基于深度学习的推荐系统：调查与新视角
[论文标题]Deep Learning based Recommender System: A Survey and New Perspectives ( ACM Computing Surveys ...
[转]Deep Reinforcement Learning Based Trading Application at JP Morgan Chase
Deep Reinforcement Learning Based Trading Application at JP Morgan Chase https://medium.com/@ranko.m ...
论文笔记: Deep Learning based Recommender System: A Survey and New Perspectives
(聊两句,突然记起来以前一个学长说的看论文要能够把论文的亮点挖掘出来,合理的进行概括23333) 传统的推荐系统方法获取的user-item关系并不能获取其中非线性以及非平凡的信息,获取非线性以及非平 ...
Predicting effects of noncoding variants with deep learning–based sequence model | 基于深度学习的序列模型预测非编码区变异的影响
Predicting effects of noncoding variants with deep learning–based sequence model PDF Interpreting no ...
论文翻译：2021_Towards model compression for deep learning based speech enhancement
论文地址:面向基于深度学习的语音增强模型压缩论文代码:没开源,鼓励大家去向作者要呀,作者是中国人,在语音增强领域深耕多年引用格式:Tan K, Wang D L. Towards model c ...
Deep High-Resolution Representation Learning for Human Pose Estimation
Deep High-Resolution Representation Learning for Human Pose Estimation 2019-08-30 22:05:59 Paper: CV ...

随机推荐

JVM调优总结 -Xms
转载自:http://unixboy.iteye.com/blog/174173/ 设置eclipse : -Xms256M -Xmx512M -XX:PermSize=256m -XX:MaxPer ...
redis 性能建议
因为 RDB 文件只用作后备用途,建议只在 Slave 上持久化 RDB 文件,而且只要15分钟一次就够了,只保留 save 900 1 这条规则. 如果 Enable AOF,好处是在恶劣请看下也只 ...
利用可排序Key-Value DB构建时间序列数据库（简论）
为了防止无良网站的爬虫抓取文章,特此标识,转载请注明文章出处.LaplaceDemon/ShiJiaqi. http://www.cnblogs.com/shijiaqi1066/p/5855064. ...
设计模式之Facade（外观）（转）
Facade的定义: 为子系统中的一组接口提供一个一致的界面. Facade一个典型应用就是数据库JDBC的应用,如下例对数据库的操作: public class DBCompare { Connec ...
linux常用命令：ls命令
ls命令是linux下最常用的命令.ls命令就是list的缩写,缺省下ls用来打印出当前目录的清单,如果ls指定其他目录那么就会显示指定目录里的文件及文件夹清单. 通过ls 命令不仅可以查看linux ...
新浪微博 [异常问题] 414 Request-URL Too Large
新浪微博 [异常问题] 414 Request-URL Too Large 浏览器上打开新浪微博,或则日志是返回结果提示:414 Request-URL Too Large原因:因同IP访问微博页面过 ...
js函数常见的写法以及调用方法
写在前面:本文详细的介绍了5中js函数常见的写法以及调用的方法,平时看别人代码的时候总是看到各种不同风格的js函数的写法.不明不白的,找了点资料,做了个总结,需要的小伙伴可以看看,做个参考.1.常规写 ...
tomcat9 性能调优
官网最靠谱 tomcat 参数官网: http://tomcat.apache.org/tomcat-7.0-doc/config/http.html <Connector port=& ...
介绍python中运算符优先级
下面这个表给出Python的运算符优先级,从最低的优先级(最松散地结合)到最高的优先级(最紧密地结合).这意味着在一个表达式中,Python会首先计算表中较下面的运算符,然后在计算列在表上部的运算符. ...
注册页面的JSON响应方式详细分析（与前端页面交互方式之一）
控制器层需求分析: 访问路径:`/user/reg.do` //自己根据功能需求设定的请求参数:`username=xx&password=xx&&phone=xx& ...

Paper Read: Robust Deep Multi-modal Learning Based on Gated Information Fusion Network

Paper Read: Robust Deep Multi-modal Learning Based on Gated Information Fusion Network的更多相关文章

随机推荐

热门专题