http://rogerioferis.com/VisualRecognitionAndSearch2014/Resources.html

Source Code

Non-exhaustive list of state-of-the-art implementations related to visual recognition and search. There is no warranty for the source code links below – use them at your own risk!

Feature Detection and Description

General Libraries:

Fast Keypoint Detectors for Real-time Applications:

  • FAST – High-speed corner detector implementation for a wide variety of platforms
  • AGAST – Even faster than the FAST corner detector. A multi-scale version of this method is used for the BRISK descriptor (ECCV
    2010).

Binary Descriptors for Real-Time Applications:

  • BRIEF – C++ code for a fast and accurate interest point descriptor (not invariant to rotations and scale) (ECCV 2010)
  • ORB – OpenCV implementation of the Oriented-Brief (ORB) descriptor (invariant to rotations,
    but not scale)
  • BRISK – Efficient Binary descriptor invariant to rotations and scale. It includes a Matlab mex interface. (ICCV 2011)
  • FREAK – Faster than BRISK (invariant to rotations and scale) (CVPR 2012)

SIFT and SURF Implementations:

Other Local Feature Detectors and Descriptors:

  • VGG Affine Covariant features – Oxford code for various affine covariant feature detectors and descriptors.
  • LIOP descriptor – Source code for the Local Intensity order Pattern (LIOP) descriptor (ICCV 2011).
  • Local Symmetry Features – Source code for matching of local symmetry features under large variations in lighting, age, and
    rendering style (CVPR 2012).

Global Image Descriptors:

  • GIST – Matlab code for the GIST descriptor
  • CENTRIST – Global visual descriptor for scene categorization and object detection (PAMI 2011)

Feature Coding and Pooling

  • VGG Feature Encoding Toolkit – Source code for various state-of-the-art feature encoding methods – including
    Standard hard encoding, Kernel codebook encoding, Locality-constrained linear encoding, and Fisher kernel encoding.
  • Spatial Pyramid Matching – Source code for feature pooling based on spatial pyramid matching (widely used for image classification)

Convolutional Nets and Deep Learning

  • Caffe – Fast C++ implementation of deep convolutional networks (GPU / CPU / ImageNet 2013 demonstration).
  • id=software:overfeat:start" style="color:rgb(165,88,88)">OverFeat – C++ library for integrated classification and localization of objects.

  • EBLearn – C++ Library for Energy-Based Learning. It includes several demos and step-by-step instructions to train classifiers based on
    convolutional neural networks.
  • Torch7 – Provides a matlab-like environment for state-of-the-art machine learning algorithms, including a fast implementation of convolutional neural
    networks.
  • Deep Learning - Various links for deep learning software.

Facial Feature Detection and Tracking

  • IntraFace – Very accurate detection and tracking of facial features (C++/Matlab API).

Part-Based Models

Attributes and Semantic Features

Large-Scale Learning

  • Additive Kernels – Source code for fast additive kernel SVM classifiers (PAMI 2013).
  • LIBLINEAR – Library for large-scale linear SVM classification.
  • VLFeat – Implementation for Pegasos SVM and Homogeneous Kernel map.

Fast Indexing and Image Retrieval

  • FLANN – Library for performing fast approximate nearest neighbor.
  • Kernelized LSH – Source code for Kernelized Locality-Sensitive Hashing (ICCV 2009).
  • ITQ Binary codes – Code for generation of small binary codes using Iterative Quantization and other baselines such as Locality-Sensitive-Hashing
    (CVPR 2011).
  • INRIA Image Retrieval – Efficient code for state-of-the-art large-scale image retrieval (CVPR 2011).

Object Detection

3D Recognition

Action Recognition


Datasets

Attributes

  • Animals with Attributes – 30,475 images of 50 animals classes with 6 pre-extracted feature representations for each image.
  • aYahoo and aPascal – Attribute annotations for images collected from Yahoo and Pascal VOC 2008.
  • FaceTracer – 15,000 faces annotated with 10 attributes and fiducial points.
  • PubFig – 58,797 face images of 200 people with 73 attribute classifier outputs.
  • LFW – 13,233 face images of 5,749 people with 73 attribute classifier outputs.
  • Human Attributes – 8,000 people with annotated attributes. Check also this link for
    another dataset of human attributes.
  • SUN Attribute Database – Large-scale scene attribute database with a taxonomy of 102 attributes.
  • ImageNet Attributes – Variety of attribute labels for the ImageNet dataset.
  • Relative attributes – Data for OSR and a subset of PubFig datasets. Check also this link for
    the WhittleSearch data.
  • Attribute Discovery Dataset – Images of shopping categories associated with textual descriptions.

Fine-grained Visual Categorization

Face Detection

  • FDDB – UMass face detection dataset and benchmark (5,000+ faces)
  • CMU/MIT – Classical face detection dataset.

Face Recognition

  • Face Recognition Homepage – Large collection of face recognition datasets.
  • LFW – UMass unconstrained face recognition dataset (13,000+ face images).
  • NIST Face Homepage – includes face recognition grand challenge (FRGC), vendor tests (FRVT) and others.
  • CMU Multi-PIE – contains more than 750,000 images of 337 people, with 15 different views and 19 lighting conditions.
  • FERET – Classical face recognition dataset.
  • Deng Cai’s face dataset in Matlab Format – Easy to use if you want play with simple face datasets including Yale,
    ORL, PIE, and Extended Yale B.
  • SCFace – Low-resolution face dataset captured from surveillance cameras.

Handwritten Digits

  • MNIST – large dataset containing a training set of 60,000 examples, and a test set of 10,000 examples.

Pedestrian Detection

Generic Object Recognition

  • ImageNet – Currently the largest visual recognition dataset in terms of number of categories and images.
  • Tiny Images – 80 million 32x32 low resolution images.
  • Pascal VOC – One of the most influential visual recognition datasets.
  • Caltech 101 / Caltech
    256
     – Popular image datasets containing 101 and 256 object categories, respectively.
  • MIT LabelMe – Online annotation tool for building computer vision databases.

Scene Recognition

Feature Detection and Description

  • VGG Affine Dataset – Widely used dataset for measuring performance of feature detection and description. CheckVLBenchmarksfor
    an evaluation framework.

Action Recognition

RGBD Recognition

state-of-the-art implementations related to visual recognition and search的更多相关文章

  1. Image Processing and Analysis_8_Edge Detection:Edge and line oriented contour detection State of the art ——2011

    此主要讨论图像处理与分析.虽然计算机视觉部分的有些内容比如特 征提取等也可以归结到图像分析中来,但鉴于它们与计算机视觉的紧密联系,以 及它们的出处,没有把它们纳入到图像处理与分析中来.同样,这里面也有 ...

  2. Convolutional Neural Networks for Visual Recognition

    http://cs231n.github.io/   里面有很多相当好的文章 http://cs231n.github.io/convolutional-networks/ Table of Cont ...

  3. 大规模视觉识别挑战赛ILSVRC2015各团队结果和方法 Large Scale Visual Recognition Challenge 2015

    Large Scale Visual Recognition Challenge 2015 (ILSVRC2015) Legend: Yellow background = winner in thi ...

  4. 论文笔记之: Bilinear CNN Models for Fine-grained Visual Recognition

    Bilinear CNN Models for Fine-grained Visual Recognition CVPR 2015 本文提出了一种双线性模型( bilinear models),一种识 ...

  5. CNN for Visual Recognition (01)

    CS231n: Convolutional Neural Networks for Visual Recognitionhttp://vision.stanford.edu/teaching/cs23 ...

  6. 【论文阅读】Deep Mixture of Diverse Experts for Large-Scale Visual Recognition

    导读: 本文为论文<Deep Mixture of Diverse Experts for Large-Scale Visual Recognition>的阅读总结.目的是做大规模图像分类 ...

  7. 目标检测--Spatial pyramid pooling in deep convolutional networks for visual recognition(PAMI, 2015)

    Spatial pyramid pooling in deep convolutional networks for visual recognition 作者: Kaiming He, Xiangy ...

  8. A Theoretical Analysis of Feature Pooling in Visual Recognition

    这篇是10年ICML的论文,但是它是从原理上来分析池化的原因,因为池化的好坏的确会影响到结果,比如有除了最大池化和均值池化,还有随机池化等等,在eccv14中海油在顶层加个空间金字塔池化的方法.可谓多 ...

  9. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

    Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition Kaiming He, Xiangyu Zh ...

随机推荐

  1. ubuntu 12.10 软件更新源列表

    ubuntu 12.10正式版已经发布了,国内各大开源软件源也陆续更新了资源.今天分享一下ubuntu 12.10 软件更新源列表. 首先,习惯性的备份一下ubuntu 12.04 原来的源地址列表文 ...

  2. Reset and clock control (RCC) STM32L

    Reset: 1.系统复位:A system reset sets all registers to their reset values except for the RTC, RTC backup ...

  3. svn代码统计工具的金额

    StatSVN介绍 StatSVN是Java写开源统计程序,从statCVS从移植.从能Subversion版本号来获取信息库,该项目开发的叙述性说明,然后生成各种表格和图表.例:时间线.针对每一个开 ...

  4. ZOJ 3795 Grouping 求最长链序列露点拓扑

    意甲冠军:特定n积分.m向边条. 该点被划分成多个集合随机的每个集合,使得2问题的关键是无法访问(集合只能容纳一个点) 问至少需要被分成几个集合. 假设没有戒指,接着这个话题正在寻求产业链最长的一个有 ...

  5. ProductHunt,TechCrunch和AppStore的差的值

    ProductHunt(产品狩猎)硅谷社区的新产品,起初只存在一个技术性的房子维修.然后进入YC训练营已经收到了几百美元的融资2. 这款产品的形式非常easy.粗产物似乎是一个节目的部位,加上一些评论 ...

  6. WPF Media 简单的播放器

    <Window x:Class="PlayTest.MediaControl" xmlns="http://schemas.microsoft.com/winfx/ ...

  7. 设置Windows 8.1屏幕自己主动旋转代码, Auto-rotate function code

    程序代码实现启用或禁用Windows 8.1 Tablet的自己主动旋转功能 方法一:使用SetDisplayAutoRotationPreferences函数功能 #include <Wind ...

  8. 玩转html5(三)---智能表单(form),使排版更加方便

    <!DOCTYPE html> <head> <meta http-equiv="Content-Type" content="text/h ...

  9. 【原创】POJ 1703 && RQNOJ 能量项链解题报告

    唉 不想说什么了 poj 1703,从看完题到写完第一个版本的代码,只有15分钟 然后一直从晚上八点WA到第二天早上 最后终于发现了BUG,题目要求的“Not sure yet.”,我打成了“No s ...

  10. 数据结构 - 双链表(C++)

    // ------DoublyLinkedList.h------ template <class T> class DNode { private: // 指向左.右结点的指针 DNod ...