A large-scale performance study of cluster-based high-dimensional indexing

Authors:
Gylfi Þór Gudmundsson;Björn Þór Jónsson;Laurent Amsaleg
Affiliations:
INRIA Rennes, Rennes, France;Reykjavik University, Reykjavik, Iceland;CNRS - IRISA, Rennes, France
Venue:
Proceedings of the international workshop on Very-large-scale multimedia corpus, mining and retrieval
Year:
2010

References (8):

Video Google: A Text Retrieval Approach to Object Matching in Videos. ICCV '03 Proceedings of the Ninth IEEE International Conference on Computer Vision - Volume 2
Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision
An efficient parts-based near-duplicate and sub-image retrieval system. Proceedings of the 12th annual ACM international conference on Multimedia
Scalable Recognition with a Vocabulary Tree. CVPR '06 Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2
The Quality vs. Time Trade-off for Approximate Image Descriptor Search. ICDEW '05 Proceedings of the 21st International Conference on Data Engineering Workshops
Scalability of local image descriptors: a comparative study. MULTIMEDIA '06 Proceedings of the 14th annual ACM international conference on Multimedia
Finding near neighbors through cluster pruning. Proceedings of the twenty-sixth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
PCA-SIFT: a more distinctive representation for local image descriptors. CVPR'04 Proceedings of the 2004 IEEE computer society conference on Computer vision and pattern recognition

Cited by (2):

Impact of storage technology on the efficiency of cluster-based high-dimensional index creation. DASFAA'12 Proceedings of the 17th international conference on Database Systems for Advanced Applications
Indexing and searching 100M images with map-reduce. Proceedings of the 3rd ACM conference on International conference on multimedia retrieval

Abstract

Some content-based image retrieval systems use high-dimensional clustering to partition the data into groups (clusters), which are then indexed to accelerate query processing. Recently, the Cluster Pruning approach was proposed as a simple way to produce such clusters. While the original evaluation of the algorithm was performed within a text indexing context at a rather small scale, its simplicity motivated us to study its behavior in an image indexing context at a much larger scale. This paper summarizes the results of this study. It shows that while the basic algorithm works fairly well, three extensions dramatically improve its performance and scalability by accelerating both query processing and the construction of clusters. This makes Cluster Pruning a promising basis for building large-scale systems that require a clustering algorithm.
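
As a rough illustration of the general idea behind cluster-based indexing, the sketch below picks about sqrt(n) random points as cluster leaders, assigns every point to its nearest leader, and answers a query by scanning only the clusters of the closest leaders. It is a minimal sketch under stated assumptions, not the implementation evaluated in the paper: the function names (`build_index`, `search`), the `probes` parameter, and the choice of squared Euclidean distance are all hypothetical, and none of the three extensions mentioned in the abstract are modeled.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_index(points, seed=0):
    """Pick ~sqrt(n) random leaders and assign each point to its nearest leader.

    Assumption: leaders are sampled uniformly at random from the data itself.
    """
    if not points:
        raise ValueError("points must be non-empty")
    rng = random.Random(seed)
    n_leaders = max(1, math.isqrt(len(points)))
    leaders = rng.sample(points, n_leaders)
    clusters = [[] for _ in leaders]
    for p in points:
        nearest = min(range(n_leaders), key=lambda i: sq_dist(p, leaders[i]))
        clusters[nearest].append(p)
    return leaders, clusters


def search(query, leaders, clusters, k=1, probes=1):
    """Approximate k-nearest-neighbor search.

    Only the clusters of the `probes` closest leaders are scanned, so true
    neighbors that fall in other clusters are missed.
    """
    order = sorted(range(len(leaders)), key=lambda i: sq_dist(query, leaders[i]))
    candidates = [p for i in order[:probes] for p in clusters[i]]
    return sorted(candidates, key=lambda p: sq_dist(query, p))[:k]


if __name__ == "__main__":
    rng = random.Random(42)
    # 1000 synthetic 8-dimensional vectors standing in for image descriptors.
    data = [tuple(rng.random() for _ in range(8)) for _ in range(1000)]
    leaders, clusters = build_index(data)
    print(search(data[0], leaders, clusters, k=3))
```

Raising `probes` scans more clusters per query, trading query time for result quality; the single-probe default is only the simplest setting for this sketch.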