Fine-Grained Image – Papers, Codes and Datasets
Table of contents
1. Introduction
2. Survey papers
3. Benchmark datasets
4. Fine-grained image recognition
   - Fine-grained recognition by localization-classification subnetworks
   - Fine-grained recognition by end-to-end feature encoding
   - Fine-grained recognition with external information
     - Fine-grained recognition with web data / auxiliary data
     - Fine-grained recognition with multi-modality data
     - Fine-grained recognition with humans in the loop
5. Fine-grained image retrieval
   - Unsupervised with pre-trained models
   - Supervised with metric learning
6. Fine-grained image generation
   - Generating from fine-grained image distributions
   - Generating from text descriptions
7. Future directions of FGIA
   - Automatic fine-grained models
   - Fine-grained few shot learning
   - Fine-grained hashing
   - FGIA within more realistic settings
8. Leaderboard
1. Introduction
This homepage lists representative papers, code, and datasets on deep-learning-based fine-grained image analysis (FGIA), including fine-grained image recognition, fine-grained image retrieval, and fine-grained image generation. If you have any questions, please feel free to leave a message.
2. Survey papers
A Survey on Deep Learning-based Fine-Grained Object Classification and Semantic Segmentation.
Bo Zhao, Jiashi Feng, Xiao Wu, and Shuicheng Yan. International Journal of Automation and Computing, 2017.
3. Benchmark datasets
Summary of popular fine-grained image datasets. "BBox" indicates whether the dataset provides object bounding-box supervision. "Part anno." indicates whether key part localizations are provided. "HRCHY" indicates hierarchical labels. "ATR" indicates attribute labels (e.g., wing color, male, female). "Texts" indicates whether fine-grained text descriptions of the images are supplied.
| Dataset name | Year | Meta-class | Images | Categories | Annotation types marked (out of BBox, Part anno., HRCHY, ATR, Texts) |
|---|---|---|---|---|---|
| Oxford flower | 2008 | Flowers | 8,189 | 102 | 1 |
| CUB200 | 2011 | Birds | 11,788 | 200 | 4 |
| Stanford Dog | 2011 | Dogs | 20,580 | 120 | 1 |
| Stanford Car | 2013 | Cars | 16,185 | 196 | 1 |
| FGVC Aircraft | 2013 | Aircraft | 10,000 | 100 | 2 |
| Birdsnap | 2014 | Birds | 49,829 | 500 | 3 |
| NABirds | 2015 | Birds | 48,562 | 555 | 2 |
| DeepFashion | 2016 | Clothes | 800,000 | 1,050 | 3 |
| Fru92 | 2017 | Fruits | 69,614 | 92 | 1 |
| Veg200 | 2017 | Vegetables | 91,117 | 200 | 1 |
| iNat2017 | 2017 | Plants & Animals | 859,000 | 5,089 | 2 |
| RPC | 2019 | Retail products | 83,739 | 200 | 2 |
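When comparing these datasets, one quantity that can be read straight off the table is the average number of images per category. The following is only a small sketch of that arithmetic: the image and category counts are copied from the table above, while the names `DATASETS`, `images_per_category`, and `rank_by_density` are made up for illustration.

```python
# Figures copied from the benchmark table: name -> (images, categories).
DATASETS = {
    "Oxford flower": (8_189, 102),
    "CUB200": (11_788, 200),
    "Stanford Dog": (20_580, 120),
    "Stanford Car": (16_185, 196),
    "FGVC Aircraft": (10_000, 100),
    "Birdsnap": (49_829, 500),
    "NABirds": (48_562, 555),
    "DeepFashion": (800_000, 1_050),
    "Fru92": (69_614, 92),
    "Veg200": (91_117, 200),
    "iNat2017": (859_000, 5_089),
    "RPC": (83_739, 200),
}


def images_per_category(name: str) -> float:
    """Average number of images per category for one dataset."""
    images, categories = DATASETS[name]
    return images / categories


def rank_by_density() -> list[str]:
    """Dataset names ordered from fewest to most images per category."""
    return sorted(DATASETS, key=images_per_category)


if __name__ == "__main__":
    for name in rank_by_density():
        print(f"{name:14s} {images_per_category(name):8.1f}")
```

This is an average only; it says nothing about how balanced the categories are within a dataset.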
4. Fine-grained image recognition
Fine-grained recognition by localization-classification subnetworks
Part-based R-CNNs for Fine-Grained Category Detection.
Ning Zhang, Jeff Donahue, Ross Girshick, and Trevor Darrell. ECCV, 2014. [code]
Interaction Part Mining: A Mid-Level Approach for Fine-Grained Action Recognition.
Yang Zhou, Bingbing Ni, Richang Hong, Meng Wang, and Qi Tian. CVPR, 2015.
Fine-Grained Recognition without Part Annotations.
Jonathan Krause, Hailin Jin, Jianchao Yang, and Li Fei-Fei. CVPR, 2015. [code]
The Application of Two-level Attention Models in Deep Convolutional Neural Network for Fine-grained Image Classification.
Tianjun Xiao, Yichong Xu, Kuiyuan Yang, Jiaxing Zhang, Yuxin Peng, and Zheng Zhang. CVPR, 2015.
Deep LAC: Deep Localization, Alignment and Classification for Fine-grained Recognition.
Di Lin, Xiaoyong Shen, Cewu Lu, and Jiaya Jia. CVPR, 2015.
Spatial Transformer Networks.
Max Jaderberg, Karen Simonyan, Andrew Zisserman, and Koray Kavukcuoglu. NIPS, 2015. [code]
Part-Stacked CNN for Fine-Grained Visual Categorization.
Shaoli Huang, Zhe Xu, Dacheng Tao, and Ya Zhang. CVPR, 2016.
Mining Discriminative Triplets of Patches for Fine-Grained Classification.
Yaming Wang, Jonghyun Choi, Vlad I. Morariu, and Larry S. Davis. CVPR, 2016.
SPDA-CNN: Unifying Semantic Part Detection and Abstraction for Fine-grained Recognition.
Han Zhang, Tao Xu, Mohamed Elhoseiny, Xiaolei Huang, Shaoting Zhang, Ahmed Elgammal, and Dimitris Metaxas. CVPR, 2016.
Picking Deep Filter Responses for Fine-grained Image Recognition.
Xiaopeng Zhang, Hongkai Xiong, Wengang Zhou, Weiyao Lin, and Qi Tian. CVPR, 2016.
Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-Grained Image Recognition.
Jianlong Fu, Heliang Zheng, and Tao Mei. CVPR, 2017.
Fine-Grained Recognition as HSnet Search for Informative Image Parts.
Michael Lam, Behrooz Mahasseni, and Sinisa Todorovic. CVPR, 2017.
Learning Multi-attention Convolutional Neural Network for Fine-Grained Image Recognition.
Heliang Zheng, Jianlong Fu, Tao Mei, and Jiebo Luo. ICCV, 2017. [code]
Weakly Supervised Learning of Part Selection Model with Spatial Constraints for Fine-Grained Image Classification.
Xiangteng He, and Yuxin Peng. AAAI, 2017.
Localizing by Describing: Attribute-Guided Attention Localization for Fine-Grained Recognition.
Xiao Liu, Jiang Wang, Shilei Wen, Errui Ding, and Yuanqing Lin. AAAI, 2017.
Learning to Navigate for Fine-grained Classification.
Ze Yang, Tiange Luo, Dong Wang, Zhiqiang Hu, Jun Gao, and Liwei Wang. ECCV, 2018. [code]
Multi-Attention Multi-Class Constraint for Fine-grained Image Recognition.
Ming Sun, Yuchen Yuan, Feng Zhou, and Errui Ding. ECCV, 2018. [code]
Weakly Supervised Complementary Parts Models for Fine-Grained Image Classification From the Bottom Up.
Weifeng Ge, Xiangru Lin, and Yizhou Yu. CVPR, 2019.
Fine-grained recognition by end-to-end feature encoding
Hyper-Class Augmented and Regularized Deep Learning for Fine-Grained Image Classification.
Saining Xie, Tianbao Yang, Xiaoyu Wang, and Yuanqing Lin. CVPR, 2015.
Subset Feature Learning for Fine-Grained Category Classification.
ZongYuan Ge, Christopher McCool, Conrad Sanderson, and Peter Corke. CVPR, 2015.
Bilinear CNN Models for Fine-grained Visual Recognition.
Tsung-Yu Lin, Aruni RoyChowdhury, and Subhransu Maji. ICCV, 2015. [code]
Multiple Granularity Descriptors for Fine-Grained Categorization.
Dequan Wang, Zhiqiang Shen, Jie Shao, Wei Zhang, Xiangyang Xue, and Zheng Zhang. ICCV, 2015.
Compact Bilinear Pooling.
Yang Gao, Oscar Beijbom, Ning Zhang, and Trevor Darrell. CVPR, 2016. [code]
Fine-Grained Image Classification by Exploring Bipartite-Graph Labels.
Feng Zhou, and Yuanqing Lin. CVPR, 2016. [project page]
Kernel Pooling for Convolutional Neural Networks.
Yin Cui, Feng Zhou, Jiang Wang, Xiao Liu, Yuanqing Lin, and Serge Belongie. CVPR, 2017.
Low-rank Bilinear Pooling for Fine-Grained Classification.
Shu Kong, and Charless Fowlkes. CVPR, 2017. [code]
Higher-order Integration of Hierarchical Convolutional Activations for Fine-Grained Visual Categorization.
Sijia Cai, Wangmeng Zuo, and Lei Zhang. ICCV, 2017. [code]
Learning a Discriminative Filter Bank within a CNN for Fine-grained Recognition.
Yaming Wang, Vlad I. Morariu, and Larry S. Davis. CVPR, 2018. [code]
Towards Faster Training of Global Covariance Pooling Networks by Iterative Matrix Square Root Normalization.
Peihua Li, Jiangtao Xie, Qilong Wang, and Zilin Gao. CVPR, 2018. [code]
Maximum-Entropy Fine Grained Classification.
Abhimanyu Dubey, Otkrist Gupta, Ramesh Raskar, and Nikhil Naik. NIPS, 2018.
Pairwise Confusion for Fine-Grained Visual Classification.
Abhimanyu Dubey, Otkrist Gupta, Pei Guo, Ramesh Raskar, Ryan Farrell, and Nikhil Naik. ECCV, 2018. [code]
DeepKSPD: Learning Kernel-matrix-based SPD Representation for Fine-Grained Image Recognition.
Melih Engin, Lei Wang, Luping Zhou, and Xinwang Liu. ECCV, 2018.
Hierarchical Bilinear Pooling for Fine-Grained Visual Recognition.
Chaojian Yu, Xinyi Zhao, Qi Zheng, Peng Zhang, and Xinge You. ECCV, 2018. [code]
Grassmann Pooling as Compact Homogeneous Bilinear Pooling for Fine-Grained Visual Classification.
Xing Wei, Yue Zhang, Yihong Gong, Jiawei Zhang, and Nanning Zheng. ECCV, 2018.
Looking for the Devil in the Details: Learning Trilinear Attention Sampling Network for Fine-Grained Image Recognition.
Heliang Zheng, Jianlong Fu, Zheng-Jun Zha, and Jiebo Luo. CVPR, 2019. [code]
Destruction and Construction Learning for Fine-grained Image Recognition.
Yue Chen, Yalong Bai, Wei Zhang, and Tao Mei. CVPR, 2019. [code]
Fine-grained recognition with external information
Fine-grained recognition with web data / auxiliary data
Augmenting Strong Supervision Using Web Data for Fine-Grained Categorization.
Zhe Xu, Shaoli Huang, Ya Zhang, and Dacheng Tao. ICCV, 2015.
Webly Supervised Learning Meets Zero-shot Learning: A Hybrid Approach for Fine-grained Classification.
Li Niu, Ashok Veeraraghavan, and Ashu Sabharwal. CVPR, 2018.
Fine-Grained Visual Categorization using Meta-Learning Optimization with Sample Selection of Auxiliary Data.
Yabin Zhang, Hui Tang, and Kui Jia. ECCV, 2018. [code]
Learning from Web Data using Adversarial Discriminative Neural Networks for Fine-Grained Classification.
Xiaoxiao Sun, Liyi Chen, and Jufeng Yang. AAAI, 2019.
Fine-grained recognition with multi-modality data
Fine-Grained Image Classification via Combining Vision and Language.
Xiangteng He, and Yuxin Peng. CVPR, 2017.
Audio Visual Attribute Discovery for Fine-Grained Object Recognition.
Hua Zhang, Xiaochun Cao, and Rui Wang. AAAI, 2018.
Fine-grained Image Classification by Visual-Semantic Embedding.
Huapeng Xu, Guilin Qi, Jingjing Li, Meng Wang, Kang Xu, and Huan Gao. IJCAI, 2018.
Knowledge-Embedded Representation Learning for Fine-Grained Image Recognition.
Tianshui Chen, Liang Lin, Riquan Chen, Yang Wu, and Xiaonan Luo. IJCAI, 2018.
Fine-grained recognition with humans in the loop
Fine-grained Categorization and Dataset Bootstrapping using Deep Metric Learning with Humans in the Loop.
Yin Cui, Feng Zhou, Yuanqing Lin, and Serge Belongie. CVPR, 2016.
5. Fine-grained image retrieval
Unsupervised with pre-trained models
Selective Convolutional Descriptor Aggregation for Fine-Grained Image Retrieval.
Xiu-Shen Wei, Jian-Hao Luo, Jianxin Wu, and Zhi-Hua Zhou. TIP, 2017. [project page]
Supervised with metric learning
Centralized Ranking Loss with Weakly Supervised Localization for Fine-Grained Object Retrieval.
Xiawu Zheng, Rongrong Ji, Xiaoshuai Sun, Yongjian Wu, Feiyue Huang, and Yanhua Yang. IJCAI, 2018.
Towards Optimal Fine Grained Retrieval via Decorrelated Centralized Loss with Normalize-Scale layer.
Xiawu Zheng, Rongrong Ji, Xiaoshuai Sun, Baochang Zhang, Yongjian Wu, and Feiyue Huang. AAAI, 2019.
6. Fine-grained image generation
Generating from fine-grained image distributions
CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training.
Jianmin Bao, Dong Chen, Fang Wen, Houqiang Li, and Gang Hua. ICCV, 2017. [code]
FineGAN: Unsupervised Hierarchical Disentanglement for Fine-Grained Object Generation and Discovery.
Krishna Kumar Singh, Utkarsh Ojha, and Yong Jae Lee. CVPR, 2019. [code]
Generating from text descriptions
AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks.
Tao Xu, Pengchuan Zhang, Qiuyuan Huang, Han Zhang, Zhe Gan, Xiaolei Huang, and Xiaodong He. CVPR, 2018. [code]
7. Future directions of FGIA
Fine-grained few shot learning
Piecewise classifier mappings: Learning fine-grained learners for novel categories with few examples.
Xiu-Shen Wei, Peng Wang, Lingqiao Liu, Chunhua Shen, and Jianxin Wu. TIP, 2019.
FGIA within more realistic settings
Fine-grained Recognition in the Wild: A Multi-Task Domain Adaptation Approach.
Timnit Gebru, Judy Hoffman, and Li Fei-Fei. ICCV, 2017.
The iNaturalist Species Classification and Detection Dataset.
Grant Van Horn, Oisin Mac Aodha, Yang Song, Yin Cui, Chen Sun, Alex Shepard, Hartwig Adam, Pietro Perona, and Serge Belongie. CVPR, 2018.
RPC: A Large-Scale Retail Product Checkout Dataset.
Xiu-Shen Wei, Quan Cui, Lei Yang, Peng Wang, and Lingqiao Liu. arXiv: 1901.07249, 2019. [project page]
8. Leaderboard
This section is continually updated. Since CUB200-2011 is the most widely used fine-grained dataset, we use it as the test bed for the fine-grained recognition leaderboard.
| Method | Publication | BBox? / Part? (annotations used) | External information? | Base model | Image resolution | Accuracy |
|---|---|---|---|---|---|---|
| PB R-CNN | ECCV 2014 | | | Alex-Net | 224x224 | 73.9% |
| MaxEnt | NIPS 2018 | | | GoogLeNet | TBD | 74.4% |
| PB R-CNN | ECCV 2014 | one | | Alex-Net | 224x224 | 76.4% |
| PS-CNN | CVPR 2016 | both | | CaffeNet | 454x454 | 76.6% |
| MaxEnt | NIPS 2018 | | | VGG-16 | TBD | 77.0% |
| Mask-CNN | PR 2018 | one | | Alex-Net | 448x448 | 78.6% |
| PC | ECCV 2018 | | | ResNet-50 | TBD | 80.2% |
| DeepLAC | CVPR 2015 | both | | Alex-Net | 227x227 | 80.3% |
| MaxEnt | NIPS 2018 | | | ResNet-50 | TBD | 80.4% |
| Triplet-A | CVPR 2016 | one | Manual labour | GoogLeNet | TBD | 80.7% |
| Multi-grained | ICCV 2015 | | WordNet etc. | VGG-19 | 224x224 | 81.7% |
| Krause et al. | CVPR 2015 | one | | CaffeNet | TBD | 82.0% |
| Multi-grained | ICCV 2015 | one | WordNet etc. | VGG-19 | 224x224 | 83.0% |
| TS | CVPR 2016 | | | VGGD+VGGM | 448x448 | 84.0% |
| Bilinear CNN | ICCV 2015 | | | VGGD+VGGM | 448x448 | 84.1% |
| STN | NIPS 2015 | | | GoogLeNet+BN | 448x448 | 84.1% |
| LRBP | CVPR 2017 | | | VGG-16 | 224x224 | 84.2% |
| PDFS | CVPR 2016 | | | VGG-16 | TBD | 84.5% |
| Xu et al. | ICCV 2015 | both | Web data | CaffeNet | 224x224 | 84.6% |
| Cai et al. | ICCV 2017 | | | VGG-16 | 448x448 | 85.3% |
| RA-CNN | CVPR 2017 | | | VGG-19 | 448x448 | 85.3% |
| MaxEnt | NIPS 2018 | | | Bilinear CNN | TBD | 85.3% |
| PC | ECCV 2018 | | | Bilinear CNN | TBD | 85.6% |
| CVL | CVPR 2017 | | Texts | VGG | TBD | 85.6% |
| Mask-CNN | PR 2018 | one | | VGG-16 | 448x448 | 85.7% |
| GP-256 | ECCV 2018 | | | VGG-16 | 448x448 | 85.8% |
| KP | CVPR 2017 | | | VGG-16 | 224x224 | 86.2% |
| T-CNN | IJCAI 2018 | | | ResNet | 224x224 | 86.2% |
| MA-CNN | ICCV 2017 | | | VGG-19 | 448x448 | 86.5% |
| MaxEnt | NIPS 2018 | | | DenseNet-161 | TBD | 86.5% |
| DeepKSPD | ECCV 2018 | | | VGG-19 | 448x448 | 86.5% |
| OSME+MAMC | ECCV 2018 | | | ResNet-101 | 448x448 | 86.5% |
| StackDRL | IJCAI 2018 | | | VGG-19 | 224x224 | 86.6% |
| DFL-CNN | CVPR 2018 | | | VGG-16 | 448x448 | 86.7% |
| PC | ECCV 2018 | | | DenseNet-161 | TBD | 86.9% |
| KERL | IJCAI 2018 | | Attributes | VGG-16 | 224x224 | 87.0% |
| HBP | ECCV 2018 | | | VGG-16 | 448x448 | 87.1% |
| Mask-CNN | PR 2018 | one | | ResNet-50 | 448x448 | 87.3% |
| DFL-CNN | CVPR 2018 | | | ResNet-50 | 448x448 | 87.4% |
| NTS-Net | ECCV 2018 | | | ResNet-50 | 448x448 | 87.5% |
| HSnet | CVPR 2017 | both | | GoogLeNet+BN | TBD | 87.5% |
| MetaFGNet | ECCV 2018 | | Auxiliary data | ResNet-34 | TBD | 87.6% |
| DCL | CVPR 2019 | | | ResNet-50 | 448x448 | 87.8% |
| TASN | CVPR 2019 | | | ResNet-50 | 448x448 | 87.9% |
| Ge et al. | CVPR 2019 | | | GoogLeNet+BN | Shorter side is 800 px | 90.4% |
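Because the leaderboard mixes backbones and input resolutions, it can help to compare methods only within the same base model. The sketch below is one possible way to do that, not part of the original list: it copies the VGG-16 and ResNet-50 rows that list no BBox/Part marks and no external information, and the names `Entry`, `ENTRIES`, and `best_per_base_model` are hypothetical.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    method: str
    base_model: str
    resolution: str   # "TBD" where the leaderboard gives no resolution
    accuracy: float   # top-1 accuracy on CUB200-2011, in percent


# Subset of leaderboard rows with no annotation marks and no external information.
ENTRIES = [
    Entry("MaxEnt", "VGG-16", "TBD", 77.0),
    Entry("LRBP", "VGG-16", "224x224", 84.2),
    Entry("PDFS", "VGG-16", "TBD", 84.5),
    Entry("Cai et al.", "VGG-16", "448x448", 85.3),
    Entry("GP-256", "VGG-16", "448x448", 85.8),
    Entry("KP", "VGG-16", "224x224", 86.2),
    Entry("DFL-CNN", "VGG-16", "448x448", 86.7),
    Entry("HBP", "VGG-16", "448x448", 87.1),
    Entry("PC", "ResNet-50", "TBD", 80.2),
    Entry("MaxEnt", "ResNet-50", "TBD", 80.4),
    Entry("DFL-CNN", "ResNet-50", "448x448", 87.4),
    Entry("NTS-Net", "ResNet-50", "448x448", 87.5),
    Entry("DCL", "ResNet-50", "448x448", 87.8),
    Entry("TASN", "ResNet-50", "448x448", 87.9),
]


def best_per_base_model(entries: list[Entry]) -> dict[str, Entry]:
    """Return the highest-accuracy entry for each base model."""
    best: dict[str, Entry] = {}
    for entry in entries:
        current = best.get(entry.base_model)
        if current is None or entry.accuracy > current.accuracy:
            best[entry.base_model] = entry
    return best


if __name__ == "__main__":
    for base_model, entry in best_per_base_model(ENTRIES).items():
        print(f"{base_model}: {entry.method} ({entry.accuracy}%, {entry.resolution})")
```

Entries with a "TBD" resolution are kept in the comparison here; filtering on `resolution` as well would give a stricter like-for-like view.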