A Probabilistic Spell for the Curse of Dimensionality

  • Authors:
  • Edgar Chávez; Gonzalo Navarro

  • Venue:
  • ALENEX '01 Revised Papers from the Third International Workshop on Algorithm Engineering and Experimentation
  • Year:
  • 2001

Abstract

Range searches in metric spaces can be very difficult if the space is "high dimensional", i.e. when the histogram of distances has a large mean and/or a small variance. This so-called "curse of dimensionality", well known in vector spaces, is also observed in metric spaces. There are at least two reasons behind the curse of dimensionality: a large search radius and/or a high intrinsic dimension of the metric space. We present a general probabilistic framework based on stretching the triangle inequality, whose direct effect is a reduction of the effective search radius. The technique gets more effective as the dimension grows, and the basic principle can be applied to any search algorithm. In this paper we apply it to a particular class of indexing algorithms. We present an analysis which helps understand the process, as well as empirical evidence showing dramatic improvements in the search time at the cost of a very small error probability.
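The idea of stretching the triangle inequality can be illustrated with a small sketch of a pivot-based range search. This is not the authors' implementation: the function names, the stretch parameter `beta`, the Euclidean test data, and the choice of a plain pivot table are all assumptions made for illustration. With `beta = 1` the pivot filter is the usual exact one, which discards an element `u` when `|d(q, p) - d(u, p)| > r` for some pivot `p`. With `beta > 1` the filter uses the reduced radius `r / beta`, so fewer candidates survive and fewer distances are computed, at the risk of missing some true answers.

```python
import random

def euclidean(a, b):
    """Euclidean distance, used here only as a stand-in metric."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def build_pivot_table(points, pivots, dist):
    """Precompute d(u, p) for every element u and every pivot p."""
    return [[dist(u, p) for p in pivots] for u in points]

def range_search(points, pivots, table, query, radius, dist, beta=1.0):
    """Pivot-based range search with a stretched triangle inequality.

    beta is a hypothetical name for the stretch factor:
    beta == 1.0 gives the exact filter, beta > 1.0 filters with the
    smaller radius radius / beta and may miss some relevant elements.
    Returns the reported elements and the number of distance evaluations.
    """
    q_to_pivots = [dist(query, p) for p in pivots]
    filter_radius = radius / beta
    evaluations = len(pivots)  # distances from the query to the pivots
    result = []
    for u, row in zip(points, table):
        # Discard u if any pivot separates it from the query by more
        # than the (possibly reduced) filter radius.
        if any(abs(dq - du) > filter_radius
               for dq, du in zip(q_to_pivots, row)):
            continue
        # Surviving candidates are always checked against the true radius,
        # so no false positives are ever reported.
        evaluations += 1
        if dist(query, u) <= radius:
            result.append(u)
    return result, evaluations

# Synthetic data: uniformly random vectors (an assumption for the demo).
rng = random.Random(0)
dim = 16
points = [tuple(rng.random() for _ in range(dim)) for _ in range(2000)]
pivots = points[:8]
table = build_pivot_table(points, pivots, euclidean)
query = tuple(rng.random() for _ in range(dim))
radius = 1.2

exact, exact_evals = range_search(
    points, pivots, table, query, radius, euclidean, beta=1.0)
approx, approx_evals = range_search(
    points, pivots, table, query, radius, euclidean, beta=2.0)

print("exact:  ", len(exact), "results,", exact_evals, "distance evaluations")
print("stretch:", len(approx), "results,", approx_evals, "distance evaluations")
```

Because every surviving candidate is verified against the true radius, the stretched search can only lose answers, never report wrong ones. How many it loses for a given `beta` depends on the distance distribution of the data, which is the trade-off the abstract describes.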