A large-scale performance study of cluster-based high-dimensional indexing

  • Authors:
  • Gylfi Þór Gudmundsson; Björn Þór Jónsson; Laurent Amsaleg

  • Affiliations:
  • INRIA Rennes, Rennes, France; Reykjavik University, Reykjavik, Iceland; CNRS - IRISA, Rennes, France

  • Venue:
  • Proceedings of the international workshop on Very-large-scale multimedia corpus, mining and retrieval
  • Year:
  • 2010

Abstract

High-dimensional clustering is used by some content-based image retrieval systems to partition the data into groups; the groups (clusters) are then indexed to accelerate query processing. Recently, the Cluster Pruning approach was proposed as a simple way to produce such clusters. The original evaluation of the algorithm was performed in a text indexing context at a rather small scale, but its simplicity motivated us to study its behavior in an image indexing context at a much larger scale. This paper summarizes the results of that study. It shows that the basic algorithm works fairly well, and that three extensions dramatically improve its performance and scalability by accelerating both query processing and cluster construction. These results make Cluster Pruning a promising basis for building large-scale systems that require a clustering algorithm.
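
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the basic cluster-then-search idea as Cluster Pruning is commonly described: pick random leaders, assign every point to its nearest leader, and answer a query by scanning only the cluster of the query's nearest leader. The function names (`build_index`, `search`), the choice of roughly sqrt(n) leaders, and the use of Euclidean distance are assumptions made for illustration. The three extensions studied in the paper are not shown.

```python
import math
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, candidates):
    """Index of the candidate vector closest to `point`."""
    return min(range(len(candidates)),
               key=lambda i: squared_distance(point, candidates[i]))


def build_index(points, rng=None):
    """Partition a non-empty list of points around randomly chosen leaders.

    Assumption: about sqrt(n) leaders are sampled from the data itself.
    """
    rng = rng or random.Random(0)
    n_leaders = max(1, math.isqrt(len(points)))
    leaders = rng.sample(points, n_leaders)
    clusters = [[] for _ in leaders]
    for p in points:
        # Each point joins the cluster of its nearest leader.
        clusters[nearest(p, leaders)].append(p)
    return leaders, clusters


def search(query, leaders, clusters, k=1):
    """Approximate k-nearest neighbors: scan only one cluster."""
    cluster = clusters[nearest(query, leaders)]
    return sorted(cluster, key=lambda p: squared_distance(query, p))[:k]
```

Because only one cluster is scanned, the result is approximate: a true nearest neighbor that was assigned to a different leader is missed. That trade of accuracy for speed is what makes the cluster index useful at large scale.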