Link of the Paper: https://arxiv.org/abs/1711.09151

Motivation:

  • LSTM units are complex and inherently sequential across time.
  • Convolutional networks have shown advantages on machine translation and conditional image generation.

Innovation:

  • The authors develop a convolutional ( CNN-based ) image captioning method that shows comparable performance to an LSTM based method on standard metrics.

    

  • The authors analyze the characteristics of CNN and LSTM nets and provide useful insights such as -- CNNs produce more entropy ( useful for diverse predictions ), better classification accuracy, and do not suffer from vanishing gradients.

Improvement:

  • Improved performance with a CNN model that uses Attention Mechanism to leverage spatial image features.

General Points:

  • Image Captioning is applicable to virtual assistants, editing tools, image indexing and support of the disabled.
  • Image Captioning is a basic ingredient for more complex operations such as storytelling and visual summarization.
  • An illustration of a classical RNN architecture for image captioning is provided below.

Paper Reading - Convolutional Image Captioning ( CVPR 2018 )的更多相关文章

  1. Paper Read: Convolutional Image Captioning

    Convolutional Image Captioning 2018-11-04 20:42:07 Paper: http://openaccess.thecvf.com/content_cvpr_ ...

  2. Paper Reading - Learning to Evaluate Image Captioning ( CVPR 2018 ) ★

    Link of the Paper: https://arxiv.org/abs/1806.06422 Innovations: The authors propose a novel learnin ...

  3. Paper Reading - Convolutional Sequence to Sequence Learning ( CoRR 2017 ) ★

    Link of the Paper: https://arxiv.org/abs/1705.03122 Motivation: Compared to recurrent layers, convol ...

  4. Paper Reading: Stereo DSO

    开篇第一篇就写一个paper reading吧,用markdown+vim写东西切换中英文挺麻烦的,有些就偷懒都用英文写了. Stereo DSO: Large-Scale Direct Sparse ...

  5. 爬取CVPR 2018过程中遇到的坑

    爬取 CVPR 2018 过程中遇到的坑 使用语言及模块 语言: Python 3.6.6 模块: re requests lxml bs4 过程 一开始都挺顺利的,先获取到所有文章的链接再逐个爬取获 ...

  6. 在矩池云上复现 CVPR 2018 LearningToCompare_FSL 环境

    这是 CVPR 2018 的一篇少样本学习论文:Learning to Compare: Relation Network for Few-Shot Learning 源码地址:https://git ...

  7. Paper Reading - Long-term Recurrent Convolutional Networks for Visual Recognition and Description ( CVPR 2015 )

    Link of the Paper: https://arxiv.org/abs/1411.4389 Main Points: A novel Recurrent Convolutional Arch ...

  8. Paper Reading - CNN+CNN: Convolutional Decoders for Image Captioning

    Link of the Paper: https://arxiv.org/abs/1805.09019 Innovations: The authors propose a CNN + CNN fra ...

  9. Paper Reading - Deep Captioning with Multimodal Recurrent Neural Networks ( m-RNN ) ( ICLR 2015 ) ★

    Link of the Paper: https://arxiv.org/pdf/1412.6632.pdf Main Points: The authors propose a multimodal ...

随机推荐

  1. Oracle中按规定的字符截取字符串

    CREATE OR REPLACE FUNCTION "F_SPLIT" (p_str IN CLOB, p_delimiter IN VARCHAR2) RETURN ty_st ...

  2. 基于Bootstrap Ace模板+bootstrap.addtabs.js的菜单

    这几天研究了基于bootstrap Ace模板+bootstra.addtabs.js实现菜单的效果 参考了这个人的博客 https://www.cnblogs.com/landeanfen/p/76 ...

  3. Spring总结以及在面试中的一些问题

    Spring总结以及在面试中的一些问题. 1.谈谈你对spring IOC和DI的理解,它们有什么区别? IoC Inverse of Control 反转控制的概念,就是将原本在程序中手动创建Use ...

  4. STM32(13)——SPI

    简介: SPI,Serial Peripheral interface串行外围设备接口. 接口应用在:EEPROM, FLASH,实时时钟,AD 转换器,还有数字信号处理器和数字信号解码器之间. 特点 ...

  5. Python(ATM机low版)

    import osclass ATM: @staticmethod def regst(): while 1: nm = input('请输入你的名字:') mm = input('请输入你的密码:' ...

  6. 单片机,struct ,union定义标志,节约RAM

    单片机的RAM是非常少的,像新唐,STC,合泰等一些国产的51单片机,RAM 512 byte,1k,2k,非常常见, 有时候我们的串口接收一串数据,或AD连续采集,这些数据是不能放到 flash 里 ...

  7. NUCLEO-L053R8 RCC时钟树 MCO输出

    RCC时钟配置实验 最近玩了一下Nucleo-L053R8板子,即STM32L053R8T6.浏览了RCC章节后,顺便做了个小实验,现在给大伙分享一下. 实验非常简单,配置一下系统时钟,可以通过肉眼观 ...

  8. ACM数论-快速幂

    ACM数论——快速幂 快速幂定义: 顾名思义,快速幂就是快速算底数的n次幂.其时间复杂度为 O(log₂N), 与朴素的O(N)相比效率有了极大的提高. 原理: 以下以求a的b次方来介绍: 把b转换成 ...

  9. 笔记-jinja2语法

    笔记-jinja2语法 1.      基本语法 控制结构 {% %} 变量取值 {{ }} 注释 {# #} 2.      变量 最常用的是变量,由Flask渲染模板时传过来,比如上例中的”nam ...

  10. 笔记-sql语句

    笔记-sql语句 1.      sql语句基础 虽然经常使用sql语句,但没有一个整体式的文档,整理了一下. 1.1.    select foundation: select <colnum ...