Main Contributions:

  1. A brief introduction about two different methods (retrieval based method and generative method) for image captioning task.
  2. The authors implemented the classical model, Show and Tell, and gave analyses based on the experiments.

Excerpts:

  1. To achieve this goal, Show & Tell model is created by hybridizing two different models. It takes the image as input and provides it into Inception-v3 model. At the end of Inception-v3 model, a single fully connected layer is added. This layer will transform the output of Inception-v3 model into a word embedding vector. We input this word embedding vector into series of LSTM cells.
  2. For any given caption, we add two additional symbols as the start word and stop word. Whenever the stop word is encounted, it stops generating the sentence and it marks end of the string.
  3. Show & Tell model uses Beam Search to find suitable words to generate captions.

[Paper Reading] Image Captioning using Deep Neural Architectures (arXiv: 1801.05568v1)的更多相关文章

  1. Paper Reading - Show and Tell: A Neural Image Caption Generator ( CVPR 2015 )

    Link of the Paper: https://arxiv.org/abs/1411.4555 Main Points: A generative model ( NIC, GoogLeNet ...

  2. Paper Reading - Show, Attend and Tell: Neural Image Caption Generation with Visual Attention ( ICML 2015 )

    Link of the Paper: https://arxiv.org/pdf/1502.03044.pdf Main Points: Encoder-Decoder Framework: Enco ...

  3. [Paper Reading] Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

    论文链接:https://arxiv.org/pdf/1502.03044.pdf 代码链接:https://github.com/kelvinxu/arctic-captions & htt ...

  4. [Paper Reading] Show and Tell: A Neural Image Caption Generator

    论文链接:https://arxiv.org/pdf/1411.4555.pdf 代码链接:https://github.com/karpathy/neuraltalk & https://g ...

  5. Training Deep Neural Networks

    http://handong1587.github.io/deep_learning/2015/10/09/training-dnn.html  //转载于 Training Deep Neural ...

  6. Adversarial Defense by Restricting the Hidden Space of Deep Neural Networks

    目录 概 主要内容 Mustafa A., Khan S., Hayat M., Goecke R., Shen J., Shao L., Adversarial Defense by Restric ...

  7. Paper Reading:Deep Neural Networks for YouTube Recommendations

    论文:Deep Neural Networks for YouTube Recommendations 发表时间:2016 发表作者:(Google)Paul Covington, Jay Adams ...

  8. 为什么深度神经网络难以训练Why are deep neural networks hard to train?

    Imagine you're an engineer who has been asked to design a computer from scratch. One day you're work ...

  9. [C4] Andrew Ng - Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization

    About this Course This course will teach you the "magic" of getting deep learning to work ...

随机推荐

  1. [JLOI2009]二叉树问题

    嘟嘟嘟 对于求深度和宽度都很好维护.深度dfs时维护就行,宽度统计同一个深度的节点有多少个,然后取max. 对于求距离,我刚开始以为是要走到根节点在回来,然后固输了(dep[u] - 1) * 2 + ...

  2. Zookeeper入门(一)之概述

    今天主要讲这么几个方面? 1.分布式应用: 2.什么是Zookeeper: 3.使用Zookkeeper有什么好处: ZooKeeper是一种分布式协调服务,用于管理大型主机.在分布式环境中协调和管理 ...

  3. node-webkit,nwjs 打包启动启动很慢解决办法

    要开发一个桌面程序,可选择的有nwjs和electron,但是electron不支持xp,客户还是有一部分系统是用xp的,只能用nwjs. 由于程序需要安装很多npm的模块,node_module文件 ...

  4. ICC Scenario Definition

    现代先进工艺下的后端设计都是在 MCMM 情况下设计的,所谓 MCMM 就是 muti-corner  muti-mode,用于芯片的不同工作模式和工作条件. 后端设计过程中,需要保证芯片在所有工作模 ...

  5. tcpdump 和 wireshark 的实用例子

    tcpdump: 1.用 tcpdump 截取本机 ip 10.2.1.2 10050 端口的包 tcpdump -nnv  -i eth0 host 10.2.1.2 and port 10050 ...

  6. AWVS使用手册

    目录: 0×00.什么是Acunetix Web Vulnarability Scanner ( What is AWVS?) 0×01.AWVS安装过程.主要文件介绍.界面简介.主要操作区域简介(I ...

  7. logistic softmax

    sigmoid函数(也叫逻辑斯谛函数):  引用wiki百科的定义: A logistic function or logistic curve is a common “S” shape (sigm ...

  8. redis集群搭建+lua脚本的使用

    详细参考这篇文章(windows) https://blog.csdn.net/qiuyufeng/article/details/70474001 一.使用JAVA代码操作redis集群 publi ...

  9. Keepalived高可用集群

    一.服务介绍 keepalive起初是专为LVS设计的,专门用来监控LVS集群系统红各个服务节点的状态,后来又加入了VRRP的功能,因此不了配合LVS服务外,也可以作为其他服务(nginx,hapro ...

  10. java程序性能优化读书笔记-垃圾回收

    衡量系统性能的点 执行速度:即响应时间 内存分配:内存分配是否合理,是否过多消耗内存或者存在内存泄露 启动时间:程序从启动到正常处理业务需要的时间 负载承受能力:当系统压力上升,系统执行速度和响应时间 ...