New instability results for high-dimensional nearest neighbor search

  • Authors:
  • Chris Giannella

  • Affiliations:
  • The MITRE Corporation, Hanover, MD, USA

  • Venue:
  • Information Processing Letters
  • Year:
  • 2009

Abstract

Consider a dataset of $n(d)$ points generated independently from $\mathbb{R}^d$ according to a common p.d.f. $f_d$ with $\mathrm{support}(f_d) = [0,1]^d$ and $\sup\{f_d(\mathbb{R}^d)\}$ growing sub-exponentially in $d$. We prove that: (i) if $n(d)$ grows sub-exponentially in $d$, then, for any query point $\vec{q} \in [0,1]^d$ and any $\epsilon > 0$, the ratio of the distances from any two dataset points to $\vec{q}$ is less than $1+\epsilon$ with probability $\to 1$ as $d \to \infty$; (ii) if $n(d) > [4(1+\epsilon)]^d$ for large $d$, then for all $\vec{q} \in [0,1]^d$ (except a small subset) and any $\epsilon > 0$, the distance ratio is less than $1+\epsilon$ with limiting probability strictly bounded away from one. Moreover, we provide preliminary results along the lines of (i) when $f_d = N(\vec{\mu}_d, \Sigma_d)$.
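The effect described in (i) can be observed numerically. The following is a minimal sketch, not the paper's method: it assumes the uniform density on [0,1]^d (one density satisfying the stated support condition), a fixed dataset size n (trivially sub-exponential in d), and a query drawn uniformly at random. The function names `distance_ratio` and `mean_ratio` are hypothetical and chosen for illustration only.

```python
import math
import random


def distance_ratio(n, d, rng):
    """Ratio of the farthest to the nearest Euclidean distance from a
    random query to n points drawn uniformly from [0,1]^d (assumed density)."""
    query = [rng.random() for _ in range(d)]
    dists = []
    for _ in range(n):
        point = [rng.random() for _ in range(d)]
        dists.append(math.dist(point, query))
    # Bounding max/min by 1 + eps bounds the ratio for every pair of points.
    return max(dists) / min(dists)


def mean_ratio(n, d, trials=20, seed=0):
    """Average the max/min distance ratio over several independent trials."""
    rng = random.Random(seed)
    return sum(distance_ratio(n, d, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    # With n fixed, the averaged ratio should drift toward 1 as d grows.
    for d in (2, 10, 100, 1000):
        print(d, round(mean_ratio(n=50, d=d), 3))
```

Because the largest-to-smallest distance ratio dominates the ratio for every pair of dataset points, watching it approach 1 as d grows illustrates the instability in (i). The sketch does not probe regime (ii), which would require a dataset size exponential in d.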