Typical Example Selection for Learning Classifiers

  • Authors:
  • Jianchao Han; Nick Cercone

  • Affiliations:
  • -;-

  • Venue:
  • AI '00 Proceedings of the 13th Biennial Conference of the Canadian Society for Computational Studies of Intelligence: Advances in Artificial Intelligence
  • Year:
  • 2000

Abstract

We consider the problem of learning classifiers from a small labeled example set and a large unlabeled example set. This situation arises in many applications, such as identifying medical images, web pages, and sensing data, where labeling examples is hard and expensive while acquiring unlabeled examples is much easier. We assume the training data is distributed according to a mixture model with Gaussian components. We propose an approach to selecting typical examples for learning classifiers, in which the typicality of an example is defined with respect to the labeled data using the squared Mahalanobis distance. An algorithm for selecting typical examples is described: a training example is drawn at random and its typicality is measured; if the typicality exceeds a threshold, the example is sampled. The number of typical examples sampled is limited by memory capacity.
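
The abstract does not give the exact typicality function, only that it is based on the squared Mahalanobis distance to the labeled data, d²(x) = (x − μ)ᵀ Σ⁻¹ (x − μ). The following is a minimal Python sketch of the threshold-based selection loop under the assumption that typicality is the monotone transform exp(−d²(x)/2); the names `typicality`, `select_typical_examples`, `threshold`, and `memory_capacity` are illustrative, not the authors'.

```python
import numpy as np

def typicality(x, mean, inv_cov):
    """Assumed typicality: map the squared Mahalanobis distance of x to (0, 1]."""
    diff = x - mean
    d2 = float(diff @ inv_cov @ diff)   # squared Mahalanobis distance
    return np.exp(-0.5 * d2)            # assumed monotone transform (not specified in the abstract)

def select_typical_examples(labeled, unlabeled, threshold=0.1, memory_capacity=1000, seed=None):
    """Randomly draw unlabeled examples and keep those whose typicality
    (measured against the labeled data) exceeds the threshold, up to a memory cap."""
    rng = np.random.default_rng(seed)
    mean = labeled.mean(axis=0)
    cov = np.cov(labeled, rowvar=False)
    inv_cov = np.linalg.pinv(cov)       # pseudo-inverse guards against a singular covariance
    selected = []
    for i in rng.permutation(len(unlabeled)):      # random draw without replacement
        if typicality(unlabeled[i], mean, inv_cov) > threshold:
            selected.append(unlabeled[i])
        if len(selected) >= memory_capacity:       # stop once the memory cap is reached
            break
    return np.array(selected)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labeled = rng.normal(0.0, 1.0, size=(50, 2))       # small labeled set
    unlabeled = rng.normal(0.0, 2.0, size=(5000, 2))   # large unlabeled set
    typical = select_typical_examples(labeled, unlabeled, threshold=0.3, memory_capacity=200)
    print(f"kept {len(typical)} of {len(unlabeled)} unlabeled examples")
```

In this sketch a higher threshold keeps only examples close to the labeled data under the Mahalanobis metric, and `memory_capacity` plays the role of the memory limit mentioned in the abstract.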