Average-case analysis of a nearest neighbor algorithm

  • Authors:
  • Pat Langley; Wayne Iba

  • Affiliations:
  • Learning Systems Department, Siemens Corporate Research, Princeton, NJ; NASA Ames Research Center, Moffett Field, CA

  • Venue:
  • IJCAI'93 Proceedings of the 13th international joint conference on Artificial intelligence - Volume 2
  • Year:
  • 1993

Abstract

In this paper we present an average-case analysis of the nearest neighbor algorithm, a simple induction method that has been studied by many researchers. Our analysis assumes a conjunctive target concept, noise-free Boolean attributes, and a uniform distribution over the instance space. We calculate the probability that the algorithm will encounter a test instance that is distance d from the prototype of the concept, along with the probability that the nearest stored training case is distance e from this test instance. From this we compute the probability of correct classification as a function of the number of observed training cases, the number of relevant attributes, and the number of irrelevant attributes. We also explore the behavioral implications of the analysis by presenting predicted learning curves for artificial domains, and give experimental results on these domains as a check on our reasoning.
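The setting described in the abstract can be illustrated with a small Monte Carlo simulation. The sketch below is not the paper's analytical derivation. It estimates the probability of correct classification empirically, assuming Hamming distance over all attributes, a conjunctive concept that is positive when every relevant attribute equals 1, and ties between equally near training cases broken in favor of the first one stored. All function names and these choices are illustrative assumptions rather than details taken from the paper.

```python
import random


def make_instance(n_attrs, rng):
    """Draw one instance uniformly from the Boolean instance space."""
    return tuple(rng.randint(0, 1) for _ in range(n_attrs))


def target(instance, n_relevant):
    """Assumed conjunctive concept: positive iff the first n_relevant attributes are all 1."""
    return all(instance[:n_relevant])


def hamming(a, b):
    """Number of attribute positions on which two instances differ."""
    return sum(x != y for x, y in zip(a, b))


def nn_classify(train, test_instance):
    """Return the label of the nearest stored case (first stored case wins ties)."""
    nearest = min(train, key=lambda pair: hamming(pair[0], test_instance))
    return nearest[1]


def estimate_accuracy(n_train, n_relevant, n_irrelevant, trials=2000, seed=0):
    """Estimate the probability of correct classification by repeated sampling."""
    rng = random.Random(seed)
    n_attrs = n_relevant + n_irrelevant
    correct = 0
    for _ in range(trials):
        # Noise-free training cases: each label comes directly from the target concept.
        train = []
        for _ in range(n_train):
            inst = make_instance(n_attrs, rng)
            train.append((inst, target(inst, n_relevant)))
        test = make_instance(n_attrs, rng)
        if nn_classify(train, test) == target(test, n_relevant):
            correct += 1
    return correct / trials


if __name__ == "__main__":
    # Empirical learning curve for a hypothetical domain with
    # 2 relevant and 3 irrelevant attributes.
    for n in (1, 5, 10, 20, 40):
        print(n, estimate_accuracy(n, n_relevant=2, n_irrelevant=3))
```

Varying `n_train`, `n_relevant`, and `n_irrelevant` gives empirical learning curves of the kind the abstract says are compared against the analytical predictions.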