Paper Reading - Convolutional Image Captioning ( CVPR 2018 )
Link of the Paper: https://arxiv.org/abs/1711.09151
Motivation:
- LSTM units are complex and inherently sequential across time.
- Convolutional networks have shown advantages on machine translation and conditional image generation.
Innovation:
- The authors develop a convolutional ( CNN-based ) image captioning method that shows comparable performance to an LSTM based method on standard metrics.

- The authors analyze the characteristics of CNN and LSTM nets and provide useful insights such as -- CNNs produce more entropy ( useful for diverse predictions ), better classification accuracy, and do not suffer from vanishing gradients.

Improvement:
- Improved performance with a CNN model that uses Attention Mechanism to leverage spatial image features.

General Points:
- Image Captioning is applicable to virtual assistants, editing tools, image indexing and support of the disabled.
- Image Captioning is a basic ingredient for more complex operations such as storytelling and visual summarization.
- An illustration of a classical RNN architecture for image captioning is provided below.

Paper Reading - Convolutional Image Captioning ( CVPR 2018 )的更多相关文章
- Paper Read: Convolutional Image Captioning
Convolutional Image Captioning 2018-11-04 20:42:07 Paper: http://openaccess.thecvf.com/content_cvpr_ ...
- Paper Reading - Learning to Evaluate Image Captioning ( CVPR 2018 ) ★
Link of the Paper: https://arxiv.org/abs/1806.06422 Innovations: The authors propose a novel learnin ...
- Paper Reading - Convolutional Sequence to Sequence Learning ( CoRR 2017 ) ★
Link of the Paper: https://arxiv.org/abs/1705.03122 Motivation: Compared to recurrent layers, convol ...
- Paper Reading: Stereo DSO
开篇第一篇就写一个paper reading吧,用markdown+vim写东西切换中英文挺麻烦的,有些就偷懒都用英文写了. Stereo DSO: Large-Scale Direct Sparse ...
- 爬取CVPR 2018过程中遇到的坑
爬取 CVPR 2018 过程中遇到的坑 使用语言及模块 语言: Python 3.6.6 模块: re requests lxml bs4 过程 一开始都挺顺利的,先获取到所有文章的链接再逐个爬取获 ...
- 在矩池云上复现 CVPR 2018 LearningToCompare_FSL 环境
这是 CVPR 2018 的一篇少样本学习论文:Learning to Compare: Relation Network for Few-Shot Learning 源码地址:https://git ...
- Paper Reading - Long-term Recurrent Convolutional Networks for Visual Recognition and Description ( CVPR 2015 )
Link of the Paper: https://arxiv.org/abs/1411.4389 Main Points: A novel Recurrent Convolutional Arch ...
- Paper Reading - CNN+CNN: Convolutional Decoders for Image Captioning
Link of the Paper: https://arxiv.org/abs/1805.09019 Innovations: The authors propose a CNN + CNN fra ...
- Paper Reading - Deep Captioning with Multimodal Recurrent Neural Networks ( m-RNN ) ( ICLR 2015 ) ★
Link of the Paper: https://arxiv.org/pdf/1412.6632.pdf Main Points: The authors propose a multimodal ...
随机推荐
- Oracle中按规定的字符截取字符串
CREATE OR REPLACE FUNCTION "F_SPLIT" (p_str IN CLOB, p_delimiter IN VARCHAR2) RETURN ty_st ...
- 基于Bootstrap Ace模板+bootstrap.addtabs.js的菜单
这几天研究了基于bootstrap Ace模板+bootstra.addtabs.js实现菜单的效果 参考了这个人的博客 https://www.cnblogs.com/landeanfen/p/76 ...
- Spring总结以及在面试中的一些问题
Spring总结以及在面试中的一些问题. 1.谈谈你对spring IOC和DI的理解,它们有什么区别? IoC Inverse of Control 反转控制的概念,就是将原本在程序中手动创建Use ...
- STM32(13)——SPI
简介: SPI,Serial Peripheral interface串行外围设备接口. 接口应用在:EEPROM, FLASH,实时时钟,AD 转换器,还有数字信号处理器和数字信号解码器之间. 特点 ...
- Python(ATM机low版)
import osclass ATM: @staticmethod def regst(): while 1: nm = input('请输入你的名字:') mm = input('请输入你的密码:' ...
- 单片机,struct ,union定义标志,节约RAM
单片机的RAM是非常少的,像新唐,STC,合泰等一些国产的51单片机,RAM 512 byte,1k,2k,非常常见, 有时候我们的串口接收一串数据,或AD连续采集,这些数据是不能放到 flash 里 ...
- NUCLEO-L053R8 RCC时钟树 MCO输出
RCC时钟配置实验 最近玩了一下Nucleo-L053R8板子,即STM32L053R8T6.浏览了RCC章节后,顺便做了个小实验,现在给大伙分享一下. 实验非常简单,配置一下系统时钟,可以通过肉眼观 ...
- ACM数论-快速幂
ACM数论——快速幂 快速幂定义: 顾名思义,快速幂就是快速算底数的n次幂.其时间复杂度为 O(log₂N), 与朴素的O(N)相比效率有了极大的提高. 原理: 以下以求a的b次方来介绍: 把b转换成 ...
- 笔记-jinja2语法
笔记-jinja2语法 1. 基本语法 控制结构 {% %} 变量取值 {{ }} 注释 {# #} 2. 变量 最常用的是变量,由Flask渲染模板时传过来,比如上例中的”nam ...
- 笔记-sql语句
笔记-sql语句 1. sql语句基础 虽然经常使用sql语句,但没有一个整体式的文档,整理了一下. 1.1. select foundation: select <colnum ...