A Simple Algorithm for Nearest Neighbor Search in High Dimensions

  • Authors: Sameer A. Nene; Shree K. Nayar
  • Affiliations: Columbia Univ., New York, NY (both authors)
  • Venue: IEEE Transactions on Pattern Analysis and Machine Intelligence
  • Year: 1997

Quantified Score

Hi-index 0.14

Abstract

The problem of finding the closest point in high-dimensional spaces is common in pattern recognition. Unfortunately, the complexity of most existing search algorithms, such as the k-d tree and the R-tree, grows exponentially with dimension, making them impractical for dimensionality above 15. In nearly all applications, the closest point is of interest only if it lies within a user-specified distance $\epsilon$. We present a simple and practical algorithm to efficiently search for the nearest neighbor within Euclidean distance $\epsilon$. The use of projection search combined with a novel data structure dramatically improves performance in high dimensions. A complexity analysis is presented which helps to automatically determine $\epsilon$ in structured problems. A comprehensive set of benchmarks clearly shows the superiority of the proposed algorithm for a variety of structured and unstructured search problems. Object recognition is demonstrated as an example application. The simplicity of the algorithm makes it possible to construct an inexpensive hardware search engine which can be 100 times faster than its software equivalent. A C++ implementation of our algorithm is available upon request to search@cs.columbia.edu/CAVE/.
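
To make the idea of projection search within distance $\epsilon$ concrete, the following is a minimal Python sketch. It is not the authors' C++ implementation and does not reproduce the paper's data structure: it assumes a plain list of per-dimension sorted coordinate arrays, and the names build_index and nearest_within are hypothetical, chosen for illustration. Each dimension is trimmed to the slab of width $2\epsilon$ around the query by binary search, the slabs are intersected to obtain the points inside a hypercube, and only those candidates are checked against the Euclidean distance.

```python
import bisect
import math


def build_index(points):
    """Per dimension, keep (coordinate, point_id) pairs sorted by coordinate.

    Assumed layout for this sketch, not the paper's data structure.
    """
    dims = len(points[0])
    return [sorted((p[d], i) for i, p in enumerate(points)) for d in range(dims)]


def nearest_within(points, index, query, eps):
    """Return (point_id, distance) of the closest point within eps, or None."""
    candidates = None
    for d, column in enumerate(index):
        # Binary search for the slab [query[d] - eps, query[d] + eps].
        lo = bisect.bisect_left(column, (query[d] - eps, -1))
        hi = bisect.bisect_right(column, (query[d] + eps, len(points)))
        slab = {i for _, i in column[lo:hi]}
        # Intersecting the slabs leaves the points inside the hypercube.
        candidates = slab if candidates is None else candidates & slab
        if not candidates:
            return None
    best = None
    for i in candidates:
        # The hypercube contains the eps-ball, so a final distance test is needed.
        dist = math.dist(points[i], query)
        if dist <= eps and (best is None or dist < best[1]):
            best = (i, dist)
    return best


points = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (0.1, 0.1, 0.1), (5.0, 5.0, 5.0)]
index = build_index(points)
result = nearest_within(points, index, (0.2, 0.2, 0.2), 0.5)
```

In this sketch the set intersection is the simplest way to combine the slabs; it is swapped in for illustration and makes no claim about how the paper trims its candidate list. The early return when the candidate set becomes empty reflects the abstract's premise that a neighbor farther than $\epsilon$ is not of interest.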