Main Contributions:

  1. A brief introduction about two different methods (retrieval based method and generative method) for image captioning task.
  2. The authors implemented the classical model, Show and Tell, and gave analyses based on the experiments.

Excerpts:

  1. To achieve this goal, Show & Tell model is created by hybridizing two different models. It takes the image as input and provides it into Inception-v3 model. At the end of Inception-v3 model, a single fully connected layer is added. This layer will transform the output of Inception-v3 model into a word embedding vector. We input this word embedding vector into series of LSTM cells.
  2. For any given caption, we add two additional symbols as the start word and stop word. Whenever the stop word is encounted, it stops generating the sentence and it marks end of the string.
  3. Show & Tell model uses Beam Search to find suitable words to generate captions.

[Paper Reading] Image Captioning using Deep Neural Architectures (arXiv: 1801.05568v1)的更多相关文章

  1. Paper Reading - Show and Tell: A Neural Image Caption Generator ( CVPR 2015 )

    Link of the Paper: https://arxiv.org/abs/1411.4555 Main Points: A generative model ( NIC, GoogLeNet ...

  2. Paper Reading - Show, Attend and Tell: Neural Image Caption Generation with Visual Attention ( ICML 2015 )

    Link of the Paper: https://arxiv.org/pdf/1502.03044.pdf Main Points: Encoder-Decoder Framework: Enco ...

  3. [Paper Reading] Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

    论文链接:https://arxiv.org/pdf/1502.03044.pdf 代码链接:https://github.com/kelvinxu/arctic-captions & htt ...

  4. [Paper Reading] Show and Tell: A Neural Image Caption Generator

    论文链接:https://arxiv.org/pdf/1411.4555.pdf 代码链接:https://github.com/karpathy/neuraltalk & https://g ...

  5. Training Deep Neural Networks

    http://handong1587.github.io/deep_learning/2015/10/09/training-dnn.html  //转载于 Training Deep Neural ...

  6. Adversarial Defense by Restricting the Hidden Space of Deep Neural Networks

    目录 概 主要内容 Mustafa A., Khan S., Hayat M., Goecke R., Shen J., Shao L., Adversarial Defense by Restric ...

  7. Paper Reading:Deep Neural Networks for YouTube Recommendations

    论文:Deep Neural Networks for YouTube Recommendations 发表时间:2016 发表作者:(Google)Paul Covington, Jay Adams ...

  8. 为什么深度神经网络难以训练Why are deep neural networks hard to train?

    Imagine you're an engineer who has been asked to design a computer from scratch. One day you're work ...

  9. [C4] Andrew Ng - Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization

    About this Course This course will teach you the "magic" of getting deep learning to work ...

随机推荐

  1. [USACO08NOV]Cheering up the Cow

    嘟嘟嘟 这道题删完边后是一棵树,那么一定和最小生成树有关啦. 考虑最后的生成树,无论从哪一个点出发,每一条边都会访问两次,而且在看一看样例,会发现走第w条边(u, v)的代价是L[w] * 2 + c ...

  2. 10分钟安装OpenStack

    1 OpenStack初学者的苦恼 2 OpenStack最低配置要求 3 配置UOS环境 3.1 设置网络 3.1.1 创建路由器 3.1.2 创建网络 3.1.3 创建两个子网 3.2 创建UOS ...

  3. centos6.5添加阿里docker加速器

    1. 配置阿里docker加速器 vi /etc/sysconfig/docker 在文件末尾追加下面两行 other_args="--registry-mirror=https://pl8 ...

  4. 适合自己的adblock过滤列表

    轻微完美主义,极简主义 已屏蔽广告: 1.CSDN的广告 2.百度侧栏热点搜索 3. 知乎广告 4.stackoverflow的推送广告 5.LeetCode的推送的是否见过这个题 bbs.csdn. ...

  5. CSS 学习路线(一)元素

    元素(element) 类型:替换和非替换元素 替换元素(replaced element): 用来替换元素内容的部分并非由文档内容直接显示. eg:img input 非替换元素(nonreplac ...

  6. layedit富文本编辑器获取纯文字内容和全部内容

  7. PDO访问方式操作数据库

    mysqli是专门访问MySQL数据库的,不能访问其它数据库.PDO可以访问多种的数据库,它把操作类合并在一起,做成一个数据访问抽象层,这个抽象层就是PDO,根据类操作对应的数据库.mysqli是一个 ...

  8. 4-[HTML]-body常用标签1

    <!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8&quo ...

  9. 3931: [CQOI2015]网络吞吐量

    3931: [CQOI2015]网络吞吐量 链接 分析: 跑一遍dijkstra,加入可以存在于最短路中的点,拆点最大流. 代码: #include<cstdio> #include< ...

  10. set方法在set传入值时报空指针异常,直接设置定值即可

    这种情况可能跟上下的程序有关,所以直接设置定值传入即可. 例如: re.setRid(ar.getRid()); // 这个是报错代码 md.setConnMailStatusTrue(ar.getR ...