On Optimizing Nearest Neighbor Queries in High-Dimensional Data Spaces

  • Authors:
  • Stefan Berchtold;Christian Böhm;Daniel A. Keim;Florian Krebs;Hans-Peter Kriegel

  • Affiliations:
  • -;-;-;-;-

  • Venue:
  • ICDT '01 Proceedings of the 8th International Conference on Database Theory
  • Year:
  • 2001

Quantified Score

Hi-index 0.00

Visualization

Abstract

Nearest-neighbor queries in high-dimensional space are of high importance in various applications, especially in content-based indexing of multimedia data. For an optimization of the query processing, accurate models for estimating the query processing costs are needed. In this paper, we propose a new cost model for nearest neighbor queries in high-dimensional space, which we apply to enhance the performance of high-dimensional index structures. The model is based on new insights into effects occurring in high-dimensional space and provides a closed formula for the processing costs of nearest neighbor queries depending on the dimensionality, the block size and the database size. From the wide range of possible applications of our model, we select two interesting samples: First, we use the model to prove the known linear complexity of the nearest neighbor search problem in high-dimensional space, and second, we provide a technique for optimizing the block size. For data of medium dimensionality, the optimized block size allows significant speed-ups of the query processing time when compared to traditional block sizes and to the linear scan.