Aggregating Local Image Descriptors into Compact Codes

Authors:
Hervé Jégou; Florent Perronnin; Matthijs Douze; Jorge Sánchez; Patrick Pérez; Cordelia Schmid
Affiliations:
INRIA, Rennes; Xerox Research Centre Europe, Grenoble; INRIA, Rhône-Alpes; National University of Córdoba; Technicolor Research and Innovation; INRIA, Grenoble
Venue:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Year:
2012

Citing 0
Cited 16

Metric learning for large scale image classification: generalizing to new classes at near-zero cost
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part II

Negative evidences and co-occurences in image retrieval: the benefit of PCA and whitening
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part II

Data-driven vehicle identification by image matching
ECCV'12 Proceedings of the 12th international conference on Computer Vision - Volume 2

Efficient image signatures and similarities using tensor products of local descriptors
Computer Vision and Image Understanding

Retrieving geo-location of videos with a divide & conquer hierarchical multimodal approach
Proceedings of the 3rd ACM conference on International conference on multimedia retrieval

Signature matching distance for content-based image retrieval
Proceedings of the 3rd ACM conference on International conference on multimedia retrieval

Tag completion based on belief theory and neighbor voting
Proceedings of the 3rd ACM conference on International conference on multimedia retrieval

Online multimodal deep similarity learning with application to image retrieval
Proceedings of the 21st ACM international conference on Multimedia

Revisiting the VLAD image representation
Proceedings of the 21st ACM international conference on Multimedia

Activity detection and recognition of daily living events
Proceedings of the 1st ACM international workshop on Multimedia indexing and information retrieval for healthcare

Search-based relevance association with auxiliary contextual cues
Proceedings of the 21st ACM international conference on Multimedia

Weighted visual vocabulary to balance the descriptive ability on general dataset
Neurocomputing

Spatially aware feature selection and weighting for object retrieval
Image and Vision Computing

Evaluating multimedia features and fusion for example-based event detection
Machine Vision and Applications

Image Classification with the Fisher Vector: Theory and Practice
International Journal of Computer Vision

Compact vectors of locally aggregated tensors for 3D shape retrieval
3DOR '13 Proceedings of the Sixth Eurographics Workshop on 3D Object Retrieval

Abstract

This paper addresses the problem of large-scale image search. Three constraints have to be taken into account: search accuracy, efficiency, and memory usage. We first present and evaluate different ways of aggregating local image descriptors into a vector and show that the Fisher kernel achieves better performance than the reference bag-of-visual words approach for any given vector dimension. We then jointly optimize dimensionality reduction and indexing in order to obtain a precise vector comparison as well as a compact representation. The evaluation shows that the image representation can be reduced to a few dozen bytes while preserving high accuracy. Searching a 100 million image data set takes about 250 ms on one processor core.
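
To make the idea of "aggregating local image descriptors into a vector" concrete, here is a minimal NumPy sketch contrasting a bag-of-visual-words histogram with a residual-based aggregation in the style of VLAD, which is commonly described as a simplified, non-probabilistic relative of the Fisher kernel representation. It is an illustration under stated assumptions, not the authors' implementation: the function names, the L2 normalization, and the use of a precomputed codebook (`centroids`, e.g. from k-means) are choices made here for the example, and the dimensionality reduction and indexing stages mentioned in the abstract are not shown.

```python
import numpy as np


def assign(descriptors, centroids):
    """Index of the nearest centroid (squared Euclidean) for each descriptor."""
    diff = descriptors[:, None, :] - centroids[None, :, :]
    return (diff ** 2).sum(axis=2).argmin(axis=1)


def bow_vector(descriptors, centroids):
    """Bag-of-visual-words: count of descriptors per visual word, L2-normalized.

    Output dimension is k (the number of centroids).
    """
    k = centroids.shape[0]
    hist = np.bincount(assign(descriptors, centroids), minlength=k).astype(np.float64)
    return hist / max(np.linalg.norm(hist), 1e-12)


def vlad_vector(descriptors, centroids):
    """VLAD-style aggregation: sum of residuals (descriptor - nearest centroid).

    Output dimension is k * d, flattened and L2-normalized.
    The normalization choice is an assumption of this sketch.
    """
    k, d = centroids.shape
    nearest = assign(descriptors, centroids)
    acc = np.zeros((k, d), dtype=np.float64)
    # np.add.at accumulates correctly when several descriptors share a centroid.
    np.add.at(acc, nearest, descriptors - centroids[nearest])
    flat = acc.ravel()
    return flat / max(np.linalg.norm(flat), 1e-12)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    local_descriptors = rng.normal(size=(500, 128))  # stand-in for SIFT-like descriptors
    codebook = rng.normal(size=(16, 128))            # stand-in for a learned codebook
    print(bow_vector(local_descriptors, codebook).shape)
    print(vlad_vector(local_descriptors, codebook).shape)
```

For the same codebook size k, the residual-based vector has k * d dimensions rather than k, because it records where descriptors fall relative to each centroid and not only how many fall near it. This is one intuition for why such representations can use far fewer visual words than a bag-of-visual-words histogram of comparable dimension.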