Paper Reading - Convolutional Image Captioning ( CVPR 2018 )

Link of the Paper: https://arxiv.org/abs/1711.09151

Motivation:

LSTM units are complex and inherently sequential across time.
Convolutional networks have shown advantages on machine translation and conditional image generation.

Innovation:

The authors develop a convolutional ( CNN-based ) image captioning method whose performance on standard metrics is comparable to that of an LSTM-based method.
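To make the contrast with a recurrent decoder concrete, here is a minimal sketch of a causal ( masked ) 1-D convolution over a word sequence. It is not taken from these notes or from the authors' code: the function name `causal_conv1d`, the scalar-per-word inputs and the averaging kernel are all assumptions chosen for illustration. The point is only that each output position can be computed independently of the others, while still never looking at future words.

```python
def causal_conv1d(sequence, kernel):
    """1-D convolution where output t depends only on inputs at positions <= t.

    Hypothetical helper for illustration: each "word" is a single float,
    whereas a real captioning model would convolve over word embeddings.
    """
    k = len(kernel)
    # Left-pad with zeros so early outputs never reach past the start
    # and no output ever sees a later (future) input.
    padded = [0.0] * (k - 1) + list(sequence)
    # Every output position is an independent computation, so all of them
    # could be evaluated in parallel, unlike the step-by-step LSTM recurrence.
    return [
        sum(kernel[j] * padded[t + j] for j in range(k))
        for t in range(len(sequence))
    ]


if __name__ == "__main__":
    words = [1.0, 2.0, 3.0, 4.0]
    print(causal_conv1d(words, [0.5, 0.5]))
```

Changing the last input leaves all earlier outputs untouched, which is the property that lets such a decoder be trained on whole captions at once.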

The authors analyze the characteristics of CNN and LSTM networks and report several insights: CNNs produce output distributions with more entropy ( useful for diverse predictions ), achieve better classification accuracy, and do not suffer from vanishing gradients.
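As a rough illustration of what "more entropy" means for a word predictor, the sketch below computes the Shannon entropy of two softmax distributions over a toy four-word vocabulary. The scores are made up for the example and are not results from the paper; the helper names `softmax` and `entropy` are likewise assumptions.

```python
import math


def softmax(scores):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Made-up scores over a four-word vocabulary (illustrative only).
peaked = softmax([8.0, 1.0, 0.5, 0.2])  # almost all mass on one word
flat = softmax([1.2, 1.0, 0.9, 0.8])    # mass spread over several words

if __name__ == "__main__":
    print("peaked:", entropy(peaked))
    print("flat:  ", entropy(flat))
```

The flatter distribution has the higher entropy: several words remain plausible at that step, which is why higher entropy is associated with more diverse captions.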

Improvement:

Performance improves further when the CNN model uses an attention mechanism to leverage spatial image features.
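The general idea of attending over spatial image features can be sketched as follows. This is a generic dot-product attention written in plain Python, not the scoring function used in the paper: the function name `attend`, the 2x2 feature grid and the query vector are all hypothetical values chosen for illustration.

```python
import math


def attend(query, features):
    """Weight spatial regions by their similarity to a query vector.

    query:    decoder state, a list of floats (hypothetical values).
    features: one feature vector per image region, flattened from a grid.
    Returns (weights, context), where context is the weighted average
    of the region vectors.
    """
    # Dot-product score per region (an assumed scoring function).
    scores = [sum(q * f for q, f in zip(query, region)) for region in features]
    # Softmax over regions so the weights are positive and sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted sum of region features.
    dim = len(features[0])
    context = [
        sum(w * region[d] for w, region in zip(weights, features))
        for d in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    # A 2x2 grid of 2-dimensional region features, flattened row by row.
    grid = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
    weights, context = attend([2.0, 0.5], grid)
    print(weights)
    print(context)
```

The region whose features align best with the query receives the largest weight, so the context vector passed to the word predictor is dominated by that part of the image.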

General Points:

Image Captioning is applicable to virtual assistants, editing tools, image indexing and support for people with disabilities.
Image Captioning is a basic ingredient for more complex operations such as storytelling and visual summarization.
An illustration of a classical RNN architecture for image captioning is provided below.
