Link of the Paper: http://papers.nips.cc/paper/4470-im2text-describing-images-using-1-million-captioned-photographs.pdf

Main Points:

  1. A large novel data set containing images from the web with associated captions written by people, filtered so that the descriptions are likely to refer to visual content.
  2. A description generation method that utilizes global image representations to retrieve and transfer captions from their data set to a query image: authors achieve this by computing the global similarity ( a sum of gist similarity and tiny image color similarity ) of a query image to their large web-collection of captioned images; they find the closest matching image ( or images ) and simply transfer over the description from the matching image to the query image.
  3. A description generation method that utilizes both global representations and direct estimates of image content (objects, actions, stuff, attributes, and scenes) to produce relevant image descriptions.

Other Key Points:

  1. Image captioning will help advance progress toward more complex human recognition goals, such as how to tell the story behind an image.
  2. An approach from Every picture tells a story: generating sentences for images produces image descriptions via a retrieval method, by translating both images and text descriptions to a shared meaning space represented by a single < object, action, scene > tuple. A description for a query image is produced by retrieving whole image descriptions via this meaning space from a set of image descriptions.
  3. The retrieval method relies on collecting and filtering a large data set of images from the internet to produce a novel web-scale captioned photo collection.

Paper Reading - Im2Text: Describing Images Using 1 Million Captioned Photographs ( NIPS 2011 )的更多相关文章

  1. Paper Reading: Stereo DSO

    开篇第一篇就写一个paper reading吧,用markdown+vim写东西切换中英文挺麻烦的,有些就偷懒都用英文写了. Stereo DSO: Large-Scale Direct Sparse ...

  2. Paper Reading - Deep Captioning with Multimodal Recurrent Neural Networks ( m-RNN ) ( ICLR 2015 ) ★

    Link of the Paper: https://arxiv.org/pdf/1412.6632.pdf Main Points: The authors propose a multimodal ...

  3. [Paper Reading]--Exploiting Relevance Feedback in Knowledge Graph

    <Exploiting Relevance Feedback in Knowledge Graph> Publication: KDD 2015 Authors: Yu Su, Sheng ...

  4. Paper Reading: Perceptual Generative Adversarial Networks for Small Object Detection

    Perceptual Generative Adversarial Networks for Small Object Detection 2017-07-11  19:47:46   CVPR 20 ...

  5. Paper Reading: In Defense of the Triplet Loss for Person Re-Identification

    In Defense of the Triplet Loss for Person Re-Identification  2017-07-02  14:04:20   This blog comes ...

  6. Paper Reading - Attention Is All You Need ( NIPS 2017 ) ★

    Link of the Paper: https://arxiv.org/abs/1706.03762 Motivation: The inherently sequential nature of ...

  7. Paper Reading - Convolutional Sequence to Sequence Learning ( CoRR 2017 ) ★

    Link of the Paper: https://arxiv.org/abs/1705.03122 Motivation: Compared to recurrent layers, convol ...

  8. Paper Reading - Deep Visual-Semantic Alignments for Generating Image Descriptions ( CVPR 2015 )

    Link of the Paper: https://arxiv.org/abs/1412.2306 Main Points: An Alignment Model: Convolutional Ne ...

  9. Paper Reading - Mind’s Eye: A Recurrent Visual Representation for Image Caption Generation ( CVPR 2015 )

    Link of the Paper: https://ieeexplore.ieee.org/document/7298856/ A Correlative Paper: Learning a Rec ...

随机推荐

  1. Beyond Compare 命令行生成目录下所有文件比对的Html网页report

    MAC环境下,使用Beyond Compare命令行生成两个文件夹差异的html,按目录递归生成. #1. 创建compare #2. 创建compare/old #3. compare/new #4 ...

  2. 容易忽略的expect脚本问题,暗藏的僵尸进程,wait命令不要漏掉

    问题描述 前几天有个小需求,用到expect脚本去循环的发送一些数据,主要问题代码如下: #! /usr/bin/expect while {true} { set timeout 60 spawn ...

  3. Token ,Cookie和Session的区别

    在做接口测试时,经常会碰到请求参数为token的类型,但是可能大部分测试人员对token,cookie,session的区别还是一知半解. Cookie cookie 是一个非常具体的东西,指的就是浏 ...

  4. Kafka 部署指南-好久没有更新博客了

    最近到了一家新公司,很多全新技术栈要理解.每天都在看各类 English Offcial Document,我的宗旨是我既然看懂了,就写下来分享,这是第一篇. 基本需求: 1.已有 zookeeper ...

  5. Extjs 中callParent的作用

    callParent 是 Sencha 类系统提供的一个用于调用你父/祖先类中的方法. 这个通常用于当你 继承一个框架类 或者 覆写一个类中提供的方法(比如 onRender) 时. 当你在一个带参数 ...

  6. 前端提交表单两种验证方式记录 jq或h5 required属性

    JQuery: <form id="form"> <input type="text" name="aaa"> &l ...

  7. Redis的特性以及优势(附官网)

    NoSQL:一类新出现的数据库(not only sql) 泛指非关系型的数据库 不支持SQL语法 存储结构跟传统关系型数据库中的那种关系表完全不同,nosql中存储的数据都是KV形式 NoSQL的世 ...

  8. 使用jdk生成自签发证书(过程总结)

    前言: 最近在做华为NB-IoT接口开发,需要用到双向认证,就去学了一下. 然后我将过程总结了一下. 相关华为论坛链接:http://developer.huawei.com/ict/forum/th ...

  9. pow函数(数学次方)在c语言的用法,两种编写方法实例( 计算1/1-1/2+1/3-1/4+1/5 …… + 1/99 - 1/100 的值)

    关于c语言里面pow函数,下面借鉴了某位博主的一篇文章: 头文件:#include <math.h> pow() 函数用来求 x 的 y 次幂(次方),x.y及函数值都是double型 , ...

  10. IDL返回众数(数组中出现次数最多的值)

    对于整型数组,可以直接利用histogram函数可以实现,示例如下: IDL>array = [1, 1, 2 , 4, 1, 3, 3, 2, 4, 5, 3, 2, 2, 1, 2, 6, ...