High-dimensional descriptor indexing for large multimedia databases

  • Authors:
  • Eduardo Valle;Matthieu Cord;Sylvie Philipp-Foliguet

  • Affiliations:
  • ETIS UMR CNRS 8051, Cergy-Pontoise, France;LIP6 - UPMC, Paris, France;ETIS UMR CNRS 8051, Cergy-Pontoise, France

  • Venue:
  • Proceedings of the 17th ACM conference on Information and knowledge management
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper we address the subject of large multimedia database indexing for content-based retrieval. We introduce multicurves, a new scheme for indexing high-dimensional descriptors. This technique, based on the simultaneous use of moderate-dimensional space-filling curves, has as main advantages the ability to handle high-dimensional data (100 dimensions and over), to allow the easy maintenance of the indexes (inclusion and deletion of data), and to adapt well to secondary storage, thus providing scalability to huge databases (millions, or even thousands of millions of descriptors). We use multicurves to perform the approximate k nearest neighbors search with a very good compromise between precision and speed. The evaluation of multicurves, carried out on large databases, demonstrates that the strategy compares well to other up-to-date k nearest neighbor search strategies. We also test multicurves on the real-world application of image identification for cultural institutions. In this application, which requires the fast search of a large amount of local descriptors, multicurves allows a dramatic speed-up in comparison to the brute-force strategy of sequential search, without any noticeable precision loss.