http://rogerioferis.com/VisualRecognitionAndSearch2014/Resources.html

Source Code

Non-exhaustive list of state-of-the-art implementations related to visual recognition and search. There is no warranty for the source code links below – use them at your own risk!

Feature Detection and Description

General Libraries:

Fast Keypoint Detectors for Real-time Applications:

  • FAST – High-speed corner detector implementation for a wide variety of platforms
  • AGAST – Even faster than the FAST corner detector. A multi-scale version of this method is used for the BRISK descriptor (ECCV
    2010).

Binary Descriptors for Real-Time Applications:

  • BRIEF – C++ code for a fast and accurate interest point descriptor (not invariant to rotations and scale) (ECCV 2010)
  • ORB – OpenCV implementation of the Oriented-Brief (ORB) descriptor (invariant to rotations,
    but not scale)
  • BRISK – Efficient Binary descriptor invariant to rotations and scale. It includes a Matlab mex interface. (ICCV 2011)
  • FREAK – Faster than BRISK (invariant to rotations and scale) (CVPR 2012)

SIFT and SURF Implementations:

Other Local Feature Detectors and Descriptors:

  • VGG Affine Covariant features – Oxford code for various affine covariant feature detectors and descriptors.
  • LIOP descriptor – Source code for the Local Intensity order Pattern (LIOP) descriptor (ICCV 2011).
  • Local Symmetry Features – Source code for matching of local symmetry features under large variations in lighting, age, and
    rendering style (CVPR 2012).

Global Image Descriptors:

  • GIST – Matlab code for the GIST descriptor
  • CENTRIST – Global visual descriptor for scene categorization and object detection (PAMI 2011)

Feature Coding and Pooling

  • VGG Feature Encoding Toolkit – Source code for various state-of-the-art feature encoding methods – including
    Standard hard encoding, Kernel codebook encoding, Locality-constrained linear encoding, and Fisher kernel encoding.
  • Spatial Pyramid Matching – Source code for feature pooling based on spatial pyramid matching (widely used for image classification)

Convolutional Nets and Deep Learning

  • Caffe – Fast C++ implementation of deep convolutional networks (GPU / CPU / ImageNet 2013 demonstration).
  • id=software:overfeat:start" style="color:rgb(165,88,88)">OverFeat – C++ library for integrated classification and localization of objects.

  • EBLearn – C++ Library for Energy-Based Learning. It includes several demos and step-by-step instructions to train classifiers based on
    convolutional neural networks.
  • Torch7 – Provides a matlab-like environment for state-of-the-art machine learning algorithms, including a fast implementation of convolutional neural
    networks.
  • Deep Learning - Various links for deep learning software.

Facial Feature Detection and Tracking

  • IntraFace – Very accurate detection and tracking of facial features (C++/Matlab API).

Part-Based Models

Attributes and Semantic Features

Large-Scale Learning

  • Additive Kernels – Source code for fast additive kernel SVM classifiers (PAMI 2013).
  • LIBLINEAR – Library for large-scale linear SVM classification.
  • VLFeat – Implementation for Pegasos SVM and Homogeneous Kernel map.

Fast Indexing and Image Retrieval

  • FLANN – Library for performing fast approximate nearest neighbor.
  • Kernelized LSH – Source code for Kernelized Locality-Sensitive Hashing (ICCV 2009).
  • ITQ Binary codes – Code for generation of small binary codes using Iterative Quantization and other baselines such as Locality-Sensitive-Hashing
    (CVPR 2011).
  • INRIA Image Retrieval – Efficient code for state-of-the-art large-scale image retrieval (CVPR 2011).

Object Detection

3D Recognition

Action Recognition


Datasets

Attributes

  • Animals with Attributes – 30,475 images of 50 animals classes with 6 pre-extracted feature representations for each image.
  • aYahoo and aPascal – Attribute annotations for images collected from Yahoo and Pascal VOC 2008.
  • FaceTracer – 15,000 faces annotated with 10 attributes and fiducial points.
  • PubFig – 58,797 face images of 200 people with 73 attribute classifier outputs.
  • LFW – 13,233 face images of 5,749 people with 73 attribute classifier outputs.
  • Human Attributes – 8,000 people with annotated attributes. Check also this link for
    another dataset of human attributes.
  • SUN Attribute Database – Large-scale scene attribute database with a taxonomy of 102 attributes.
  • ImageNet Attributes – Variety of attribute labels for the ImageNet dataset.
  • Relative attributes – Data for OSR and a subset of PubFig datasets. Check also this link for
    the WhittleSearch data.
  • Attribute Discovery Dataset – Images of shopping categories associated with textual descriptions.

Fine-grained Visual Categorization

Face Detection

  • FDDB – UMass face detection dataset and benchmark (5,000+ faces)
  • CMU/MIT – Classical face detection dataset.

Face Recognition

  • Face Recognition Homepage – Large collection of face recognition datasets.
  • LFW – UMass unconstrained face recognition dataset (13,000+ face images).
  • NIST Face Homepage – includes face recognition grand challenge (FRGC), vendor tests (FRVT) and others.
  • CMU Multi-PIE – contains more than 750,000 images of 337 people, with 15 different views and 19 lighting conditions.
  • FERET – Classical face recognition dataset.
  • Deng Cai’s face dataset in Matlab Format – Easy to use if you want play with simple face datasets including Yale,
    ORL, PIE, and Extended Yale B.
  • SCFace – Low-resolution face dataset captured from surveillance cameras.

Handwritten Digits

  • MNIST – large dataset containing a training set of 60,000 examples, and a test set of 10,000 examples.

Pedestrian Detection

Generic Object Recognition

  • ImageNet – Currently the largest visual recognition dataset in terms of number of categories and images.
  • Tiny Images – 80 million 32x32 low resolution images.
  • Pascal VOC – One of the most influential visual recognition datasets.
  • Caltech 101 / Caltech
    256
     – Popular image datasets containing 101 and 256 object categories, respectively.
  • MIT LabelMe – Online annotation tool for building computer vision databases.

Scene Recognition

Feature Detection and Description

  • VGG Affine Dataset – Widely used dataset for measuring performance of feature detection and description. CheckVLBenchmarksfor
    an evaluation framework.

Action Recognition

RGBD Recognition

state-of-the-art implementations related to visual recognition and search的更多相关文章

  1. Image Processing and Analysis_8_Edge Detection:Edge and line oriented contour detection State of the art ——2011

    此主要讨论图像处理与分析.虽然计算机视觉部分的有些内容比如特 征提取等也可以归结到图像分析中来,但鉴于它们与计算机视觉的紧密联系,以 及它们的出处,没有把它们纳入到图像处理与分析中来.同样,这里面也有 ...

  2. Convolutional Neural Networks for Visual Recognition

    http://cs231n.github.io/   里面有很多相当好的文章 http://cs231n.github.io/convolutional-networks/ Table of Cont ...

  3. 大规模视觉识别挑战赛ILSVRC2015各团队结果和方法 Large Scale Visual Recognition Challenge 2015

    Large Scale Visual Recognition Challenge 2015 (ILSVRC2015) Legend: Yellow background = winner in thi ...

  4. 论文笔记之: Bilinear CNN Models for Fine-grained Visual Recognition

    Bilinear CNN Models for Fine-grained Visual Recognition CVPR 2015 本文提出了一种双线性模型( bilinear models),一种识 ...

  5. CNN for Visual Recognition (01)

    CS231n: Convolutional Neural Networks for Visual Recognitionhttp://vision.stanford.edu/teaching/cs23 ...

  6. 【论文阅读】Deep Mixture of Diverse Experts for Large-Scale Visual Recognition

    导读: 本文为论文<Deep Mixture of Diverse Experts for Large-Scale Visual Recognition>的阅读总结.目的是做大规模图像分类 ...

  7. 目标检测--Spatial pyramid pooling in deep convolutional networks for visual recognition(PAMI, 2015)

    Spatial pyramid pooling in deep convolutional networks for visual recognition 作者: Kaiming He, Xiangy ...

  8. A Theoretical Analysis of Feature Pooling in Visual Recognition

    这篇是10年ICML的论文,但是它是从原理上来分析池化的原因,因为池化的好坏的确会影响到结果,比如有除了最大池化和均值池化,还有随机池化等等,在eccv14中海油在顶层加个空间金字塔池化的方法.可谓多 ...

  9. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

    Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition Kaiming He, Xiangyu Zh ...

随机推荐

  1. 【SSH 基础】浅谈Hibernate关系映射(4)

    继上篇博客 多对多关联映射(单向) 多对多对象关系映射,须要增加一张新表完毕基本映射. Hibernate会自己主动生成中间表 Hibernate使用many-to-many标签来表示多对多的关联,多 ...

  2. java文字转成拼音

    package com.jframe.kit; import net.sourceforge.pinyin4j.PinyinHelper; import net.sourceforge.pinyin4 ...

  3. crm工作机会实体

    using System;     using Microsoft.Xrm.Sdk;     using Microsoft.Crm.Sdk.Messages; public class Opport ...

  4. SQL Server无法连接到(local)问题的解决的方法

    今天在使用数据库的时候突然发现,SQL Server08竟然连接不上了.问题如图所看到的: 于是在网上搜索了一下这个问题,发现有非常多相似的提问,既然这个问题不是少数人遇到,看来这个问题还是值得研究一 ...

  5. 怎么做fastreport使用离线数据源

    近期使用做项目发现fastreport使用在线数据源.紧密耦合的数据库连接字符串.在部署稍加注意.easy错误.因此,是否想到脱机使用的数据源. 官方参考: watermark/2/text/aHR0 ...

  6. Understanding responsibilities is key to good object-oriented design(转)

    对象和数据的主要差别就是对象有行为,行为可以看成责任职责(responsibilities以下简称职责)的一种,理解职责是实现好的OO设计的关键.“Understanding responsibili ...

  7. Jetty安装学习并展示

    Jetty 的基本架构 Jetty 眼下的是一个比較被看好的 Servlet 引擎,它的架构比較简单,也是一个可扩展性和很灵活的应用server,它有一个基本数据模型,这个数据模型就是 Handler ...

  8. Ising模型(伊辛模型)

    Ising模型(伊辛模型)是一个最简单且能够提供非常丰富的物理内容的模型.可用于描写叙述非常多物理现象,如:合金中的有序-无序转变.液氦到超流态的转变.液体的冻结与蒸发.玻璃物质的性质.森林火灾.城市 ...

  9. APUE学习总结

    简介 本文总结了个人,一个数字,对应称号<APUE>第一版的每一章,但是,独立的二级标题和书,人需求进行编写. 3.文件I/O 本章所说明的函数常常被称之为不带缓存的I/O(与第5章中说明 ...

  10. 栈实现java

    栈是一种“先去后出”的抽象的数据结构.例如:我们在洗盘子的时候,洗完一个盘子,将其放在一摞盘子的最上面,但我们全部洗完后,要是有盘子时,我们会先从最上面的盘子开始使用,这种例子就像栈的数据结构一样,先 ...