Indexing Schemes for Similarity Search: an Illustrated Paradigm

  • Authors:
  • Vladimir Pestov; Aleksandar Stojmirović

  • Affiliations:
  • Department of Mathematics and Statistics, University of Ottawa, Ontario, Canada. E-mails: vpest283@uottawa.ca, astojmir@uottawa.ca

  • Venue:
  • Fundamenta Informaticae
  • Year:
  • 2006

Abstract

We suggest a variation of the Hellerstein-Koutsoupias-Papadimitriou indexability model for datasets equipped with a similarity measure, with the aim of better understanding the structure of indexing schemes for similarity-based search and the geometry of similarity workloads. In particular, this provides a unified approach to a great variety of schemes used to index into metric spaces and facilitates their transfer to more general similarity measures such as quasi-metrics. We discuss links between the performance of indexing schemes and high-dimensional geometry. The concepts and results are illustrated on a very large concrete dataset of peptide fragments equipped with a biologically significant similarity measure.
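
To make the idea of an indexing scheme for range similarity queries in a metric space concrete, the following is a minimal sketch of one well-known family of such schemes, pivot-based pruning with the triangle inequality. It is an illustration only and is not taken from the paper: the function names (`euclidean`, `build_pivot_table`, `range_query`), the choice of Euclidean distance, and the toy data are all assumptions. The pruning rule relies on symmetry of the distance, so it would need to be adapted (using one-sided bounds) before it could be applied to a quasi-metric.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]
Distance = Callable[[Point, Point], float]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance, used here only as an example of a metric."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def build_pivot_table(
    data: Sequence[Point], pivots: Sequence[Point], dist: Distance
) -> List[List[float]]:
    """Precompute d(pivot, x) for every data point x and every pivot."""
    return [[dist(p, x) for p in pivots] for x in data]


def range_query(
    query: Point,
    radius: float,
    data: Sequence[Point],
    pivots: Sequence[Point],
    table: Sequence[Sequence[float]],
    dist: Distance,
) -> Tuple[List[Point], int]:
    """Return all x with d(query, x) <= radius and the number of
    distance computations made against data points.

    For a (symmetric) metric, the triangle inequality gives
    |d(q, p) - d(p, x)| <= d(q, x), so any x for which this lower
    bound exceeds the radius can be discarded without computing d(q, x).
    """
    query_to_pivots = [dist(query, p) for p in pivots]
    hits: List[Point] = []
    computed = 0
    for x, row in zip(data, table):
        if any(abs(qp - px) > radius for qp, px in zip(query_to_pivots, row)):
            continue  # pruned by a pivot
        computed += 1
        if dist(query, x) <= radius:
            hits.append(x)
    return hits, computed


# Toy example (hypothetical data, one pivot at the origin).
data = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (10.0, 10.0)]
pivots = [(0.0, 0.0)]
table = build_pivot_table(data, pivots, euclidean)
hits, computed = range_query((0.5, 0.0), 1.0, data, pivots, table, euclidean)
```

In this toy run the two distant points are discarded using only the precomputed pivot distances, so the query distance is evaluated against two of the four data points. How effective such pruning is depends strongly on the geometry of the dataset and of the query workload.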