Link of the Paper: https://arxiv.org/abs/1711.09151

Motivation:

  • LSTM units are complex and inherently sequential across time.
  • Convolutional networks have shown advantages on machine translation and conditional image generation.

Innovation:

  • The authors develop a convolutional ( CNN-based ) image captioning method that shows comparable performance to an LSTM based method on standard metrics.

    

  • The authors analyze the characteristics of CNN and LSTM nets and provide useful insights such as -- CNNs produce more entropy ( useful for diverse predictions ), better classification accuracy, and do not suffer from vanishing gradients.

Improvement:

  • Improved performance with a CNN model that uses Attention Mechanism to leverage spatial image features.

General Points:

  • Image Captioning is applicable to virtual assistants, editing tools, image indexing and support of the disabled.
  • Image Captioning is a basic ingredient for more complex operations such as storytelling and visual summarization.
  • An illustration of a classical RNN architecture for image captioning is provided below.

Paper Reading - Convolutional Image Captioning ( CVPR 2018 )的更多相关文章

  1. Paper Read: Convolutional Image Captioning

    Convolutional Image Captioning 2018-11-04 20:42:07 Paper: http://openaccess.thecvf.com/content_cvpr_ ...

  2. Paper Reading - Learning to Evaluate Image Captioning ( CVPR 2018 ) ★

    Link of the Paper: https://arxiv.org/abs/1806.06422 Innovations: The authors propose a novel learnin ...

  3. Paper Reading - Convolutional Sequence to Sequence Learning ( CoRR 2017 ) ★

    Link of the Paper: https://arxiv.org/abs/1705.03122 Motivation: Compared to recurrent layers, convol ...

  4. Paper Reading: Stereo DSO

    开篇第一篇就写一个paper reading吧,用markdown+vim写东西切换中英文挺麻烦的,有些就偷懒都用英文写了. Stereo DSO: Large-Scale Direct Sparse ...

  5. 爬取CVPR 2018过程中遇到的坑

    爬取 CVPR 2018 过程中遇到的坑 使用语言及模块 语言: Python 3.6.6 模块: re requests lxml bs4 过程 一开始都挺顺利的,先获取到所有文章的链接再逐个爬取获 ...

  6. 在矩池云上复现 CVPR 2018 LearningToCompare_FSL 环境

    这是 CVPR 2018 的一篇少样本学习论文:Learning to Compare: Relation Network for Few-Shot Learning 源码地址:https://git ...

  7. Paper Reading - Long-term Recurrent Convolutional Networks for Visual Recognition and Description ( CVPR 2015 )

    Link of the Paper: https://arxiv.org/abs/1411.4389 Main Points: A novel Recurrent Convolutional Arch ...

  8. Paper Reading - CNN+CNN: Convolutional Decoders for Image Captioning

    Link of the Paper: https://arxiv.org/abs/1805.09019 Innovations: The authors propose a CNN + CNN fra ...

  9. Paper Reading - Deep Captioning with Multimodal Recurrent Neural Networks ( m-RNN ) ( ICLR 2015 ) ★

    Link of the Paper: https://arxiv.org/pdf/1412.6632.pdf Main Points: The authors propose a multimodal ...

随机推荐

  1. 安装MySQL8.0.13

    引用于:CrazyDemo,博客地址:http://www.cnblogs.com/CrazyDemo 下载地址: https://www.mysql.com/downloads/ 现在最下边的社区版 ...

  2. Swift_类和结构体

    Swift_类和结构体 点击查看源码 struct Resolution { var width = 0 var height = 0 } class VideoMode { var resoluti ...

  3. 『ACM C++』 Codeforces | 1005D - Polycarp and Div 3

    今天佛了,魔鬼周一,在线教学,有点小累,但还好,今天AC了一道,每日一道,还好达成目标,还以为今天完不成了,最近任务越来越多,如何高效完成该好好思考一下了~最重要的还是学业的复习和预习. 今日兴趣新闻 ...

  4. java servlet数据库查询并将数据显示到jsp页面

    需要的jar包:mysql-connector-java.jar build path只是个jar包的引用,部署的时候想不丢包最好还是手动拷贝到对应项目的lib文件下. 在try{}中定义的变量为局部 ...

  5. 常用模块 - datetime模块

    一.简介 datetime是Python处理日期和时间的标准库. 1.datetime模块中常用的类: 类名 功能说明 date 日期对象,常用的属性有year, month, day time 时间 ...

  6. 08JavaScript对象

    JavaScript 对象是拥有属性和方法的数据. 注:在 JavaScript 中,对象是非常重要的,当你理解了对象,就可以了解 JavaScript . 1.JavaScript 对象 在 Jav ...

  7. mysql的length与char_length的区别

    length:   是计算字段的长度一个汉字是算三个字符,一个数字或字母算一个字符 char_length:不管汉字还是数字或者是字母都算是一个字符 同时这两个函数,可用于判断数据中是否有中文文字 例 ...

  8. 帝国cms教程父栏目和子栏目都能在当前栏目高亮

    首先在/e/class/userfun.php这个文件里面加上下面代码.上面父栏目的,下面子栏目的.红色代表css样式.自定义吧 function currentPage($classid,$this ...

  9. hadoop生态搭建(3节点)-16.elk配置

    # ==================================================================ELK环境准备 # 修改文件限制 # * 代表Linux所有用户 ...

  10. 如何在HHDI中进行数据质量探查并获取数据剖析报告

    通过执行多种数据剖析规则,对目标表(或一段SQL语句)进行数据质量探查,从而得到其数据质量情况.目前支持以下几种数据剖析类型,分别是:数字值分析.值匹配检查.字符值分析.日期值分析.布尔值分析.重复值 ...