Table of contents

  1. Introduction

  2. Survey papers

  3. Benchmark datasets

  4. Fine-grained image recognition

    1. Fine-grained recognition by localization-classification subnetworks

    2. Fine-grained recognition by end-to-end feature encoding

    3. Fine-grained recognition with external information

      1. Fine-grained recognition with web data / auxiliary data

      2. Fine-grained recognition with multi-modality data

      3. Fine-grained recognition with humans in the loop

  5. Fine-grained image retrieval

    1. Unsupervised with pre-trained models

    2. Supervised with metric learning

  6. Fine-grained image generation

    1. Generating from fine-grained image distributions

    2. Generating from text descriptions

  7. Future directions of FGIA

    1. Automatic fine-grained models

    2. Fine-grained few-shot learning

    3. Fine-grained hashing

    4. FGIA within more realistic settings

  8. Leaderboard

1. Introduction


This page lists representative papers, code, and datasets on deep learning based fine-grained image analysis (FGIA), including fine-grained image recognition, fine-grained image retrieval, and fine-grained image generation. If you have any questions, please feel free to leave a message.

2. Survey papers


3. Benchmark datasets


Summary of popular fine-grained image datasets. Note that "BBox" indicates whether the dataset provides object bounding box annotations. "Part anno." indicates that key part localizations are provided. "HRCHY" corresponds to hierarchical labels. "ATR" represents attribute labels (e.g., wing color, male, female). "Texts" indicates whether fine-grained text descriptions of the images are supplied.

| Dataset name | Year | Meta-class | # images | # categories | BBox | Part anno. | HRCHY | ATR | Texts |
|---|---|---|---|---|---|---|---|---|---|
| Oxford Flower | 2008 | Flowers | 8,189 | 102 | | | | | ✓ |
| CUB200 | 2011 | Birds | 11,788 | 200 | ✓ | ✓ | | ✓ | ✓ |
| Stanford Dog | 2011 | Dogs | 20,580 | 120 | ✓ | | | | |
| Stanford Car | 2013 | Cars | 16,185 | 196 | ✓ | | | | |
| FGVC Aircraft | 2013 | Aircraft | 10,000 | 100 | ✓ | | ✓ | | |
| Birdsnap | 2014 | Birds | 49,829 | 500 | ✓ | ✓ | | ✓ | |
| NABirds | 2015 | Birds | 48,562 | 555 | ✓ | ✓ | | | |
| DeepFashion | 2016 | Clothes | 800,000 | 1,050 | ✓ | ✓ | | ✓ | |
| Fru92 | 2017 | Fruits | 69,614 | 92 | | | ✓ | | |
| Veg200 | 2017 | Vegetables | 91,117 | 200 | | | ✓ | | |
| iNat2017 | 2017 | Plants & Animals | 859,000 | 5,089 | ✓ | | ✓ | | |
| RPC | 2019 | Retail products | 83,739 | 200 | ✓ | | ✓ | | |
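Most of these datasets ship their annotations as plain-text metadata files. As an illustrative sketch (the helper function name is ours, not part of any dataset toolkit), here is how CUB200-2011's standard metadata files (`images.txt`, `train_test_split.txt`, `image_class_labels.txt`, each a list of `<image_id> <value>` lines) can be parsed into train/test splits:

```python
def load_cub_split(images_txt, split_txt, labels_txt):
    """Parse the contents of CUB200-2011's images.txt,
    train_test_split.txt and image_class_labels.txt into
    (train, test) lists of (relative_image_path, class_id) pairs."""
    # images.txt: "<image_id> <relative_path>" per line
    paths = dict(line.split() for line in images_txt.strip().splitlines())
    # image_class_labels.txt: "<image_id> <class_id>" per line
    labels = dict(line.split() for line in labels_txt.strip().splitlines())
    train, test = [], []
    # train_test_split.txt: "<image_id> <is_training_image>" per line
    for line in split_txt.strip().splitlines():
        img_id, is_train = line.split()
        item = (paths[img_id], int(labels[img_id]))
        (train if is_train == "1" else test).append(item)
    return train, test
```

The relative paths can then be joined with the dataset's `images/` directory and fed to any data-loading pipeline.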

4. Fine-grained image recognition


Fine-grained recognition by localization-classification subnetworks

Fine-grained recognition by end-to-end feature encoding

Fine-grained recognition with external information

Fine-grained recognition with web data / auxiliary data

Fine-grained recognition with multi-modality data

Fine-grained recognition with humans in the loop

5. Fine-grained image retrieval


Unsupervised with pre-trained models

Supervised with metric learning

6. Fine-grained image generation


Generating from fine-grained image distributions

Generating from text descriptions

7. Future directions of FGIA


Automatic fine-grained models

Fine-grained few-shot learning

Fine-grained hashing

FGIA within more realistic settings

8. Leaderboard


This section is continually updated. Since CUB200-2011 is the most widely used fine-grained dataset, we treat it as the test bed and list the fine-grained recognition leaderboard on it.

| Method | Publication | BBox? | Part? | External information? | Base model | Image resolution | Accuracy |
|---|---|---|---|---|---|---|---|
| PB R-CNN | ECCV 2014 | | | | Alex-Net | 224x224 | 73.9% |
| MaxEnt | NIPS 2018 | | | | GoogLeNet | TBD | 74.4% |
| PB R-CNN | ECCV 2014 | ✓ | ✓ | | Alex-Net | 224x224 | 76.4% |
| PS-CNN | CVPR 2016 | ✓ | ✓ | | CaffeNet | 454x454 | 76.6% |
| MaxEnt | NIPS 2018 | | | | VGG-16 | TBD | 77.0% |
| Mask-CNN | PR 2018 | | ✓ | | Alex-Net | 448x448 | 78.6% |
| PC | ECCV 2018 | | | | ResNet-50 | TBD | 80.2% |
| DeepLAC | CVPR 2015 | ✓ | ✓ | | Alex-Net | 227x227 | 80.3% |
| MaxEnt | NIPS 2018 | | | | ResNet-50 | TBD | 80.4% |
| Triplet-A | CVPR 2016 | | | Manual labour | GoogLeNet | TBD | 80.7% |
| Multi-grained | ICCV 2015 | | | WordNet etc. | VGG-19 | 224x224 | 81.7% |
| Krause et al. | CVPR 2015 | ✓ | | | CaffeNet | TBD | 82.0% |
| Multi-grained | ICCV 2015 | ✓ | | WordNet etc. | VGG-19 | 224x224 | 83.0% |
| TS | CVPR 2016 | | | | VGGD+VGGM | 448x448 | 84.0% |
| Bilinear CNN | ICCV 2015 | | | | VGGD+VGGM | 448x448 | 84.1% |
| STN | NIPS 2015 | | | | GoogLeNet+BN | 448x448 | 84.1% |
| LRBP | CVPR 2017 | | | | VGG-16 | 224x224 | 84.2% |
| PDFS | CVPR 2016 | | | | VGG-16 | TBD | 84.5% |
| Xu et al. | ICCV 2015 | ✓ | ✓ | Web data | CaffeNet | 224x224 | 84.6% |
| Cai et al. | ICCV 2017 | | | | VGG-16 | 448x448 | 85.3% |
| RA-CNN | CVPR 2017 | | | | VGG-19 | 448x448 | 85.3% |
| MaxEnt | NIPS 2018 | | | | Bilinear CNN | TBD | 85.3% |
| PC | ECCV 2018 | | | | Bilinear CNN | TBD | 85.6% |
| CVL | CVPR 2017 | | | Texts | VGG | TBD | 85.6% |
| Mask-CNN | PR 2018 | | ✓ | | VGG-16 | 448x448 | 85.7% |
| GP-256 | ECCV 2018 | | | | VGG-16 | 448x448 | 85.8% |
| KP | CVPR 2017 | | | | VGG-16 | 224x224 | 86.2% |
| T-CNN | IJCAI 2018 | | | | ResNet | 224x224 | 86.2% |
| MA-CNN | ICCV 2017 | | | | VGG-19 | 448x448 | 86.5% |
| MaxEnt | NIPS 2018 | | | | DenseNet-161 | TBD | 86.5% |
| DeepKSPD | ECCV 2018 | | | | VGG-19 | 448x448 | 86.5% |
| OSME+MAMC | ECCV 2018 | | | | ResNet-101 | 448x448 | 86.5% |
| StackDRL | IJCAI 2018 | | | | VGG-19 | 224x224 | 86.6% |
| DFL-CNN | CVPR 2018 | | | | VGG-16 | 448x448 | 86.7% |
| PC | ECCV 2018 | | | | DenseNet-161 | TBD | 86.9% |
| KERL | IJCAI 2018 | | | Attributes | VGG-16 | 224x224 | 87.0% |
| HBP | ECCV 2018 | | | | VGG-16 | 448x448 | 87.1% |
| Mask-CNN | PR 2018 | | ✓ | | ResNet-50 | 448x448 | 87.3% |
| DFL-CNN | CVPR 2018 | | | | ResNet-50 | 448x448 | 87.4% |
| NTS-Net | ECCV 2018 | | | | ResNet-50 | 448x448 | 87.5% |
| HSnet | CVPR 2017 | | ✓ | | GoogLeNet+BN | TBD | 87.5% |
| MetaFGNet | ECCV 2018 | | | Auxiliary data | ResNet-34 | TBD | 87.6% |
| DCL | CVPR 2019 | | | | ResNet-50 | 448x448 | 87.8% |
| TASN | CVPR 2019 | | | | ResNet-50 | 448x448 | 87.9% |
| Ge et al. | CVPR 2019 | | | | GoogLeNet+BN | Shorter side is 800 px | 90.4% |
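All accuracy numbers above are top-1 classification accuracy on the CUB200-2011 test set (5,794 test images under the official split). As a reminder of exactly what that metric computes, here is a minimal sketch (the function name is ours):

```python
def top1_accuracy(predicted_classes, true_classes):
    """Fraction of images whose top-scoring predicted class
    matches the ground-truth class label."""
    assert len(predicted_classes) == len(true_classes)
    correct = sum(p == t for p, t in zip(predicted_classes, true_classes))
    return correct / len(true_classes)
```

For example, 3 correct predictions out of 4 test images gives `top1_accuracy([1, 2, 3, 4], [1, 2, 0, 4]) == 0.75`, i.e. 75.0% on the leaderboard's scale.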
