Effects of domain characteristics on instance-based learning algorithms

  • Authors:
  • Seishi Okamoto; Nobuhiro Yugami

  • Affiliations:
  • Fujitsu Laboratories, 1-9-3 Nakase, Mihama-ku, 213-8588 Chiba, Japan; Fujitsu Laboratories, 1-9-3 Nakase, Mihama-ku, 213-8588 Chiba, Japan

  • Venue:
  • Theoretical Computer Science - Selected papers in honour of Setsuo Arikawa
  • Year:
  • 2003

Abstract

This paper presents average-case analyses of instance-based learning algorithms. The algorithms analyzed employ a variant of the k-nearest neighbor classifier (k-NN). Our analysis deals with a monotone m-of-n target concept with irrelevant attributes, and handles three types of noise: relevant attribute noise, irrelevant attribute noise, and class noise. We formally represent the expected classification accuracy of k-NN as a function of domain characteristics, including the number of training instances, the number of relevant and irrelevant attributes, the threshold number in the target concept, the probability of each attribute, the noise rate for each type of noise, and k. We also explore the behavioral implications of the analyses by presenting the effects of domain characteristics on the expected accuracy of k-NN and on the optimal value of k for artificial domains.
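
The setting described in the abstract can be illustrated with a small Monte Carlo sketch. It estimates k-NN accuracy empirically and is not the paper's closed-form average-case analysis. Several details are assumptions made only for illustration: attributes are Boolean and independently 1 with probability p, each kind of noise flips a value with the given rate and affects training instances only, distance is Hamming distance, ties among equidistant neighbors are broken by training order, and test instances are noise-free. All function and parameter names are hypothetical.

```python
import random


def m_of_n_label(x, n_relevant, m):
    """Monotone m-of-n concept: positive iff at least m of the relevant bits are 1."""
    return int(sum(x[:n_relevant]) >= m)


def flip(bit, rate, rng):
    """Flip a Boolean value with probability `rate` (assumed noise model)."""
    return 1 - bit if rng.random() < rate else bit


def sample_instance(n_relevant, n_irrelevant, p, rng):
    """Draw a clean instance; each attribute is 1 with probability p (assumption)."""
    return [int(rng.random() < p) for _ in range(n_relevant + n_irrelevant)]


def noisy_training_instance(n_relevant, n_irrelevant, m, p,
                            rel_noise, irr_noise, class_noise, rng):
    """Label a clean instance, then corrupt its attributes and its class label."""
    clean = sample_instance(n_relevant, n_irrelevant, p, rng)
    label = m_of_n_label(clean, n_relevant, m)
    noisy = [flip(b, rel_noise if i < n_relevant else irr_noise, rng)
             for i, b in enumerate(clean)]
    return noisy, flip(label, class_noise, rng)


def knn_predict(train, x, k):
    """Majority vote over the k nearest training instances (Hamming distance).

    sorted() is stable, so equidistant neighbors keep training order.
    A tied vote (possible only for even k) is classified as negative.
    """
    neighbors = sorted(train, key=lambda t: sum(a != b for a, b in zip(t[0], x)))[:k]
    votes = sum(label for _, label in neighbors)
    return int(2 * votes > k)


def estimate_accuracy(n_train, n_relevant, n_irrelevant, m, k, p=0.5,
                      rel_noise=0.0, irr_noise=0.0, class_noise=0.0,
                      n_test=200, trials=20, seed=0):
    """Average accuracy on clean test instances over several random training sets."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        train = [noisy_training_instance(n_relevant, n_irrelevant, m, p,
                                         rel_noise, irr_noise, class_noise, rng)
                 for _ in range(n_train)]
        for _ in range(n_test):
            x = sample_instance(n_relevant, n_irrelevant, p, rng)
            correct += knn_predict(train, x, k) == m_of_n_label(x, n_relevant, m)
    return correct / (trials * n_test)


if __name__ == "__main__":
    # Sweep odd k for a 3-of-5 concept with 5 irrelevant attributes and 10% class noise.
    for k in (1, 3, 5, 7, 9):
        acc = estimate_accuracy(n_train=50, n_relevant=5, n_irrelevant=5, m=3,
                                k=k, class_noise=0.1)
        print(k, round(acc, 3))
```

Sweeping k in this way for different noise rates or numbers of irrelevant attributes gives an empirical counterpart to the effects on expected accuracy and on the optimal k that the paper derives analytically.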