Universal indexing of arbitrary similarity models

  • Authors: Tomáš Bartoš
  • Affiliations: Charles University in Prague, Faculty of Mathematics and Physics, SIRET Research Group, Prague 1, Czech Republic
  • Venue: Proceedings of the VLDB Endowment
  • Year: 2013

Abstract

The increasing amount of available unstructured content, together with the growing number of large nonrelational databases, puts more emphasis on content-based retrieval and, specifically, on similarity searching. Although several indexing methods exist for efficient querying, not all of them are well suited to arbitrary similarity models. In a metric space, we can readily apply metric access methods; for nonmetric models, however, which typically describe similarities between generally unstructured objects better, the situation is more complicated. To address this challenge, we introduce SIMDEX, a universal framework capable of finding alternative indexing methods that support efficient yet effective similarity searching for any similarity model. Using trivial or more advanced methods for the incremental exploration of possible indexing techniques, we are able to find alternatives to the widely used metric space model paradigm. Through experimental evaluations, we validate our approach and show how it outperforms the known indexing methods.
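
To make the metric versus nonmetric distinction concrete, the following minimal sketch shows why metric access methods do not carry over directly to nonmetric models. It is an illustration only and is not part of SIMDEX: the function names, the one-dimensional sample points, and the choice of squared Euclidean distance as the nonmetric example are all assumptions made here. Metric access methods typically prune objects with the pivot-based lower bound |d(q, p) - d(o, p)| <= d(q, o), which follows from the triangle inequality; when the distance violates that inequality, the bound can overestimate the true distance and relevant objects may be discarded.

```python
import math


def euclidean(a, b):
    # A metric distance: it satisfies the triangle inequality.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def squared_euclidean(a, b):
    # A nonmetric distance (assumed example): it violates the triangle inequality.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pivot_lower_bound(dist, query, obj, pivot):
    # Lower bound on dist(query, obj) that is valid only if dist is a metric.
    return abs(dist(query, pivot) - dist(obj, pivot))


def lower_bound_holds(dist, query, obj, pivot):
    # True when the pivot bound does not exceed the real distance.
    return pivot_lower_bound(dist, query, obj, pivot) <= dist(query, obj) + 1e-12


# Hypothetical one-dimensional sample points.
query, obj, pivot = (0.0,), (1.0,), (2.0,)

metric_ok = lower_bound_holds(euclidean, query, obj, pivot)
nonmetric_ok = lower_bound_holds(squared_euclidean, query, obj, pivot)
```

With these sample points the bound holds for the Euclidean distance but fails for the squared variant, where the bound evaluates to 3 while the true distance is 1. A range query with radius 2 would therefore wrongly prune the object, which is the kind of situation that motivates looking for alternative indexing methods.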