Bellman strikes again!: the growth rate of sample complexity with dimension for the nearest neighbor classifier

  • Authors:
  • Santosh S. Venkatesh;Robert R. Snapp;Demetri Psaltis

  • Affiliations:
  • Electrical Engineering Department, University of Pennsylvania, Philadelphia, PA;Department of Computer Science and Electrical Engineering, University of Vermont, Burlington, VT;Electrical Engineering Department, California Institute of Technology, Pasadena, CA

  • Venue:
  • COLT '92 Proceedings of the fifth annual workshop on Computational learning theory
  • Year:
  • 1992

Abstract

The finite-sample performance of a nearest neighbor classifier is analyzed for a two-class pattern recognition problem. An exact integral expression is derived for the m-sample risk R_m, given that a reference m-sample of labeled points, drawn independently from Euclidean n-space according to a fixed probability distribution, is available to the classifier. For a family of smooth distributions, it is shown that R_m has a complete asymptotic expansion about R_∞ as m → ∞, where R_∞ denotes the nearest neighbor risk in the infinite-sample limit. Explicit definitions of the expansion coefficients are given in terms of the underlying distribution. Because the convergence of R_m to R_∞ slows dramatically as n increases, this analysis provides an analytic validation of Bellman's curse of dimensionality. Numerical simulations corroborating the formal results are included. The rates of convergence for less restrictive families of distributions are also discussed.
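
The m-sample risk described in the abstract can be estimated numerically with a small Monte Carlo experiment. The sketch below is illustrative only and is not the authors' simulation: the uniform distribution on the unit cube, the `posterior` function, and the names `nn_risk_estimate` and `R_INF` are all assumptions made for this example. Because the class posterior depends only on the first coordinate, the infinite-sample risk, E[2η(1 − η)], equals 0.34 in every dimension n, so any change in the estimates across n reflects the finite-sample gap rather than a change in the problem.

```python
import math
import random

# Hypothetical toy problem (not from the paper): points are uniform on
# [0, 1]^n and the class posterior depends only on the first coordinate.
AMPLITUDE = 0.4

# Infinite-sample 1-NN risk E[2 * eta * (1 - eta)] for this toy posterior:
# 2 * (0.25 - AMPLITUDE**2 / 2) = 0.34, independent of the dimension n.
R_INF = 0.5 - 2.0 * AMPLITUDE ** 2 * 0.5


def posterior(x):
    """Assumed smooth P(label = 1 | x)."""
    return 0.5 + AMPLITUDE * math.sin(2.0 * math.pi * x[0])


def draw_labeled_point(n, rng):
    """Draw one point uniformly from the unit cube with a random label."""
    x = [rng.random() for _ in range(n)]
    y = 1 if rng.random() < posterior(x) else 0
    return x, y


def nn_risk_estimate(m, n, trials, rng):
    """Monte Carlo estimate of the m-sample nearest neighbor risk.

    Each trial draws a fresh reference m-sample and one test point,
    then checks whether the nearest reference label matches the test label.
    """
    errors = 0
    for _ in range(trials):
        sample = [draw_labeled_point(n, rng) for _ in range(m)]
        x, y = draw_labeled_point(n, rng)
        # Squared Euclidean distance is enough to find the nearest neighbor.
        _, nearest_label = min(
            sample,
            key=lambda ref: sum((a - b) ** 2 for a, b in zip(ref[0], x)),
        )
        if nearest_label != y:
            errors += 1
    return errors / trials


if __name__ == "__main__":
    rng = random.Random(0)
    print(f"infinite-sample risk for this toy problem: {R_INF:.3f}")
    for n in (1, 2, 4, 8):
        for m in (10, 100):
            risk = nn_risk_estimate(m, n, trials=500, rng=rng)
            print(f"n={n}  m={m:4d}  estimated R_m = {risk:.3f}")
```

With only a few hundred trials the estimates are noisy, so the trial count should be raised before reading anything into small differences between dimensions.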