Paper Reading - Convolutional Image Captioning ( CVPR 2018 )
Link of the Paper: https://arxiv.org/abs/1711.09151
Motivation:
- LSTM units are complex and inherently sequential across time.
- Convolutional networks have shown advantages on machine translation and conditional image generation.
Innovation:
- The authors develop a convolutional ( CNN-based ) image captioning method that shows comparable performance to an LSTM based method on standard metrics.

- The authors analyze the characteristics of CNN and LSTM nets and provide useful insights such as -- CNNs produce more entropy ( useful for diverse predictions ), better classification accuracy, and do not suffer from vanishing gradients.

Improvement:
- Improved performance with a CNN model that uses Attention Mechanism to leverage spatial image features.

General Points:
- Image Captioning is applicable to virtual assistants, editing tools, image indexing and support of the disabled.
- Image Captioning is a basic ingredient for more complex operations such as storytelling and visual summarization.
- An illustration of a classical RNN architecture for image captioning is provided below.

Paper Reading - Convolutional Image Captioning ( CVPR 2018 )的更多相关文章
- Paper Read: Convolutional Image Captioning
Convolutional Image Captioning 2018-11-04 20:42:07 Paper: http://openaccess.thecvf.com/content_cvpr_ ...
- Paper Reading - Learning to Evaluate Image Captioning ( CVPR 2018 ) ★
Link of the Paper: https://arxiv.org/abs/1806.06422 Innovations: The authors propose a novel learnin ...
- Paper Reading - Convolutional Sequence to Sequence Learning ( CoRR 2017 ) ★
Link of the Paper: https://arxiv.org/abs/1705.03122 Motivation: Compared to recurrent layers, convol ...
- Paper Reading: Stereo DSO
开篇第一篇就写一个paper reading吧,用markdown+vim写东西切换中英文挺麻烦的,有些就偷懒都用英文写了. Stereo DSO: Large-Scale Direct Sparse ...
- 爬取CVPR 2018过程中遇到的坑
爬取 CVPR 2018 过程中遇到的坑 使用语言及模块 语言: Python 3.6.6 模块: re requests lxml bs4 过程 一开始都挺顺利的,先获取到所有文章的链接再逐个爬取获 ...
- 在矩池云上复现 CVPR 2018 LearningToCompare_FSL 环境
这是 CVPR 2018 的一篇少样本学习论文:Learning to Compare: Relation Network for Few-Shot Learning 源码地址:https://git ...
- Paper Reading - Long-term Recurrent Convolutional Networks for Visual Recognition and Description ( CVPR 2015 )
Link of the Paper: https://arxiv.org/abs/1411.4389 Main Points: A novel Recurrent Convolutional Arch ...
- Paper Reading - CNN+CNN: Convolutional Decoders for Image Captioning
Link of the Paper: https://arxiv.org/abs/1805.09019 Innovations: The authors propose a CNN + CNN fra ...
- Paper Reading - Deep Captioning with Multimodal Recurrent Neural Networks ( m-RNN ) ( ICLR 2015 ) ★
Link of the Paper: https://arxiv.org/pdf/1412.6632.pdf Main Points: The authors propose a multimodal ...
随机推荐
- Matplotlib——初级
matplotlib是一个专门用来绘图的库,在分析数据的时候,使用它可以将数据进行可视化,更直观的呈现.下面是几个通过matplot绘制的图. 通过图形的绘制,我们可以很清晰地看到数据直接的关系,并对 ...
- css代码添加背景图片常用代码
css代码添加背景图片常用代码 1 背景颜色 { font-size: 16px; content: ""; display: block; width: 700px; heigh ...
- Jquery 操作 select 的操作指南
这里我们以一个简单的select作为原型来进行说明: <select> <option value="a1">香蕉1</option> < ...
- iOS之webview加载网页、文件、html的方法
UIWebView 是用来加载加载网页数据的一个框.UIWebView可以用来加载pdf.word.doc 等等文件 生成webview 有两种方法,1.通过storyboard 拖拽 2.通过a ...
- 关于mybatis编写sql问题.
最近呢楼主回到长沙进行面试:被问了一个这样的问题,在mybatis中怎么进行模糊查询,望各位大佬在下方进行评论,好让我这菜鸡多学习一些.
- mint-ui message box 问题;
当引用 mint-ui message box 的 出现的问题,我暂时是不知道为什么: 官网是这样写的: 于是 我也这么做的:(这里用小写,具体我也不清楚,毕竟文档上写的也不是很清楚,但是只有这样写, ...
- sqlserver 导出数据库表结构
https://www.cnblogs.com/miaomiaoquanfa/p/6909835.html SELECT 表名 = case when a.colorder=1 then d.name ...
- mongodb C++ Driver安装
前言 mongocxx官网地址 http://mongocxx.org/?jmp=docs 本文的安装版本是:mongocxx-r3.2.0.tar.gz . 参考文档安装过程http://mongo ...
- Hive命令行及参数配置
1 . Hive 命令行 输入$HIVE_HOME/bin/hive –H 或者 –help 可以显示帮助选项: 说明: 1. -i 初始化 HQL 文件. 2. -e 从命令行执行指定的 HQL ...
- Python-变量与基础数据类型
·变量(variable) 笔记: 变量本质上是一个占位符.变量可以用来存储整数.字符串.列表等.简单的可以理解为一个座位,可以坐老人也可以坐小孩,可以坐男孩,也可以坐女孩. 在python里,标识 ...