Main Contributions:

  1. A brief introduction to two approaches to the image captioning task: retrieval-based methods and generative methods.
  2. The authors implement the classic Show and Tell model and analyze it based on their experiments.

Excerpts:

  1. To achieve this goal, the Show & Tell model is created by hybridizing two different models. It takes the image as input and feeds it into the Inception-v3 model. At the end of the Inception-v3 model, a single fully connected layer is added. This layer transforms the output of the Inception-v3 model into a word embedding vector, which is then fed into a series of LSTM cells.
  2. For any given caption, we add two additional symbols as the start word and the stop word. Whenever the stop word is encountered, the model stops generating the sentence; the stop word marks the end of the string.
  3. The Show & Tell model uses beam search to find suitable words when generating captions.
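
The last two excerpts can be illustrated with a minimal sketch of beam search over captions delimited by start and stop words. Everything concrete here is an assumption for illustration and not taken from the paper: the marker strings `<S>` and `</S>`, the `TOY_MODEL` probability table (a stand-in for the LSTM decoder), the function names, and the beam widths.

```python
import math

# Hypothetical start/stop markers; the excerpts do not name the actual symbols.
START, STOP = "<S>", "</S>"

# Toy stand-in for the LSTM decoder: next-word probabilities given the last word.
# A real decoder would condition on the image and on the whole prefix.
TOY_MODEL = {
    "<S>": {"a": 0.6, "the": 0.4},
    "a": {"dog": 0.5, "cat": 0.4, "</S>": 0.1},
    "the": {"dog": 0.9, "</S>": 0.1},
    "dog": {"runs": 0.6, "</S>": 0.4},
    "cat": {"</S>": 1.0},
    "runs": {"</S>": 1.0},
}


def next_word_probs(prefix):
    """Return a {word: probability} dict for the word following `prefix`."""
    return TOY_MODEL[prefix[-1]]


def beam_search(beam_width=2, max_len=10):
    """Keep the `beam_width` best partial captions at each step."""
    beams = [(0.0, [START])]  # (log-probability, words so far)
    finished = []
    for _ in range(max_len):
        # Extend every live beam by every possible next word.
        candidates = []
        for score, words in beams:
            for word, p in next_word_probs(words).items():
                candidates.append((score + math.log(p), words + [word]))
        candidates.sort(key=lambda c: c[0], reverse=True)

        # Keep the top candidates; a caption ending in STOP is complete.
        beams = []
        for score, words in candidates[:beam_width]:
            if words[-1] == STOP:
                finished.append((score, words))
            else:
                beams.append((score, words))
        if not beams:
            break

    best = max(finished or beams, key=lambda c: c[0])
    # Strip the start and stop markers before returning the caption.
    return [w for w in best[1] if w not in (START, STOP)]


print(beam_search(beam_width=1))  # greedy decoding
print(beam_search(beam_width=2))
```

In this toy table, a width of 1 (greedy decoding) commits to "a" as the first word and ends with a caption of probability 0.18, while a width of 2 keeps "the" alive long enough to find a caption of probability 0.216. That is the usual motivation for beam search over picking the single most likely word at each step.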

Paper: Image Captioning using Deep Neural Architectures (arXiv: 1801.05568v1)
