Efficient k-NN search on vertically decomposed data

  • Authors:
  • Arjen P. de Vries;Nikos Mamoulis;Niels Nes;Martin Kersten

  • Affiliations:
  • Centrum voor Wiskunde en Informatica, Kruislaan 413, 1098 SJ, Amsterdam, The Netherlands;University of Hong Kong, Pokfulam Road, Hong Kong;Centrum voor Wiskunde en Informatica, Kruislaan 413, 1098 SJ, Amsterdam, The Netherlands;Centrum voor Wiskunde en Informatica, Kruislaan 413, 1098 SJ, Amsterdam, The Netherlands

  • Venue:
  • Proceedings of the 2002 ACM SIGMOD international conference on Management of data
  • Year:
  • 2002

Abstract

Applications like multimedia retrieval require efficient support for similarity search on large data collections. Yet nearest neighbor search is a difficult problem in high-dimensional spaces, rendering efficient applications hard to realize: index structures degrade rapidly with increasing dimensionality, while sequential search is not an attractive solution for repositories with millions of objects. This paper approaches the problem from a different angle. A solution is sought in an unconventional storage scheme that opens up a new range of techniques for processing k-NN queries, especially suited for high-dimensional spaces. The suggested (physical) database design accommodates a novel variant of branch-and-bound search that quickly reduces the high-dimensional space to a small candidate set. The paper provides insight into applying this idea to k-NN search using two similarity metrics commonly encountered in image database applications, and discusses techniques for its implementation in relational database systems. The effectiveness of the proposed method is evaluated empirically on both real and synthetic data sets, and significant improvements in response time are reported.
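
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of how a column-wise (vertically decomposed) layout can support bound-based pruning; it should not be read as the authors' method. It assumes squared Euclidean distance, feature values normalized to [0, 1], and a hypothetical function name `knn_columnwise`. Dimensions are scanned one column at a time, each candidate keeps a partial distance (a lower bound on its final distance), and a candidate is dropped once that lower bound exceeds the k-th smallest upper bound.

```python
import heapq


def knn_columnwise(columns, query, k):
    """Return the ids of the k nearest objects under squared Euclidean distance.

    columns[j][i] is the value of object i in dimension j (one list per dimension).
    Assumption for this sketch: every value, including the query, lies in [0, 1].
    """
    n = len(columns[0])
    # Largest squared contribution a dimension can make, given values in [0, 1].
    max_term = [max(q, 1.0 - q) ** 2 for q in query]
    # Candidate id -> partial squared distance over the dimensions seen so far.
    partial = {i: 0.0 for i in range(n)}

    for j, column in enumerate(columns):
        q = query[j]
        for i in partial:
            partial[i] += (column[i] - q) ** 2

        if len(partial) > k:
            # Upper bound on what the unseen dimensions can still add.
            remaining = sum(max_term[j + 1:])
            # The k-th smallest partial distance plus `remaining` bounds the
            # final distance of at least k candidates from above.
            kth_upper = heapq.nsmallest(k, partial.values())[-1] + remaining
            # A partial distance is a lower bound, so anything above the
            # k-th upper bound cannot be among the k nearest.
            partial = {i: d for i, d in partial.items() if d <= kth_upper}

    # Ties are broken by object id to make the result deterministic.
    return sorted(partial, key=lambda i: (partial[i], i))[:k]


if __name__ == "__main__":
    # Three 2-dimensional objects stored as two columns.
    cols = [[0.0, 1.0, 0.5], [0.0, 1.0, 0.5]]
    print(knn_columnwise(cols, [0.4, 0.4], 1))
```

In this sketch the pruning only pays off when the first few dimensions already separate near from far objects; the order in which dimensions are processed is left as-is here, and a real system would likely choose it more carefully.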