Paper Reading - Convolutional Image Captioning ( CVPR 2018 )

Link of the Paper: https://arxiv.org/abs/1711.09151

Motivation:

LSTM units are complex and inherently sequential across time.
Convolutional networks have shown advantages on machine translation and conditional image generation.
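
To make the "inherently sequential" point concrete, here is a minimal sketch in plain Python. It contrasts a toy recurrence, where each step needs the previous hidden state, with a causal (left-padded) 1D convolution, where every output position depends only on a fixed window of inputs and could therefore be computed in parallel. The function names `recurrent_scan` and `causal_conv1d`, the scalar inputs, and the `decay` parameter are illustrative assumptions, not code from the paper.

```python
def recurrent_scan(seq, decay):
    """Toy recurrence: h[t] = decay * h[t-1] + x[t].

    Step t cannot start until step t-1 has finished.
    """
    h = 0.0
    out = []
    for x in seq:
        h = decay * h + x
        out.append(h)
    return out


def causal_conv1d(seq, kernel):
    """Causal 1D convolution over a scalar sequence.

    out[t] depends only on seq[t-k+1 .. t] (zeros before the start),
    so each position is independent of the other outputs.
    """
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(seq)
    return [
        sum(kernel[j] * padded[t + j] for j in range(k))
        for t in range(len(seq))
    ]


if __name__ == "__main__":
    print(recurrent_scan([1, 2, 3], 0.5))
    print(causal_conv1d([1, 2, 3, 4], [0.5, 0.5]))
```

Changing a later input leaves the earlier outputs of `causal_conv1d` unchanged, which is the property a decoder needs so that it never looks at future words.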

Innovation:

The authors develop a convolutional ( CNN-based ) image captioning method that performs comparably to an LSTM-based method on standard metrics.

The authors analyze the characteristics of CNN and LSTM nets and report several insights: CNNs produce output distributions with more entropy ( useful for diverse predictions ), achieve better classification accuracy, and do not suffer from vanishing gradients.
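
As a rough illustration of what "more entropy" means here, the sketch below computes the Shannon entropy of two next-word distributions: one sharply peaked and one flatter. The logits are made-up numbers chosen for illustration, not values from the paper, and `softmax` and `entropy` are helper names introduced for this example.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical logits over a 4-word vocabulary.
peaked = softmax([8.0, 1.0, 0.5, 0.2])  # one word dominates
flat = softmax([1.2, 1.0, 0.9, 0.8])    # several plausible words

if __name__ == "__main__":
    print("peaked:", entropy(peaked))
    print("flat:  ", entropy(flat))
```

A flatter distribution has higher entropy, so sampling or beam search from it can reach a wider range of words, which is the sense in which higher entropy supports diverse predictions.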

Improvement:

Performance improves further when the CNN model uses an attention mechanism to leverage spatial image features.
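
The general idea of attending over spatial image features can be sketched as follows. This is a minimal example that assumes plain dot-product scoring between one decoder state and a handful of region feature vectors; the paper's exact scoring function may differ, and `attend`, `query`, and `regions` are names introduced for illustration.

```python
import math


def attend(query, regions):
    """Weight spatial feature vectors by their similarity to a decoder state.

    query:   list of floats, the decoder state for the current word
    regions: list of feature vectors, one per spatial location of the image
    Returns (weights, context): softmax weights over locations and the
    weighted sum of the region features.
    """
    # Dot-product score for each spatial location (an assumed scoring choice).
    scores = [sum(q * r for q, r in zip(query, region)) for region in regions]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(regions[0])
    context = [
        sum(w * region[d] for w, region in zip(weights, regions))
        for d in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy 3-location feature map
    weights, context = attend([2.0, 0.0], regions)
    print(weights, context)
```

The context vector is then combined with the decoder state when predicting the next word, so each word can draw on the image regions most relevant to it.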

General Points:

Image captioning is applicable to virtual assistants, editing tools, image indexing, and support for people with disabilities.
Image captioning is a basic ingredient for more complex operations such as storytelling and visual summarization.
[Figure: an illustration of a classical RNN architecture for image captioning.]
