Concept learning and the problem of small disjuncts

  • Authors:
  • Robert C. Holte; Liane E. Acker; Bruce W. Porter

  • Affiliations:
  • Computer Science Department, University of Ottawa, Ottawa, Canada; Department of Computer Sciences, University of Texas at Austin, Austin, Texas; Department of Computer Sciences, University of Texas at Austin, Austin, Texas

  • Venue:
  • IJCAI'89 Proceedings of the 11th international joint conference on Artificial intelligence - Volume 1
  • Year:
  • 1989

Abstract

Ideally, definitions induced from examples should consist of all, and only, disjuncts that are meaningful (e.g., as measured by a statistical significance test) and have a low error rate. Existing inductive systems create definitions that are ideal with regard to large disjuncts, but far from ideal with regard to small disjuncts, where a small (large) disjunct is one that correctly classifies few (many) training examples. The problem with small disjuncts is that many of them have high rates of misclassification, and it is difficult to eliminate the error-prone small disjuncts from a definition without adversely affecting other disjuncts in the definition. Various approaches to this problem are evaluated, including the novel approach of using a bias different from the "maximum generality" bias. This approach, and some others, prove partly successful, but the problem of small disjuncts remains open.
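The distinction the abstract draws can be made concrete with a small sketch. The code below is a hypothetical illustration, not the authors' method: each disjunct (rule) is characterized by how many training examples it covers and how many of those it misclassifies, and a disjunct is deemed "small" when it correctly classifies few training examples. The class name, threshold, and all numbers are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Disjunct:
    """One disjunct (rule) of an induced definition, summarized by its
    behavior on the training set. All fields are illustrative."""
    name: str
    covered: int   # training examples the disjunct matches
    errors: int    # of those, how many it misclassifies

    @property
    def error_rate(self) -> float:
        # Fraction of covered training examples that are misclassified.
        return self.errors / self.covered if self.covered else 0.0

def small_disjuncts(disjuncts, threshold):
    """A disjunct is 'small' if it correctly classifies at most
    `threshold` training examples (per the abstract's definition)."""
    return [d for d in disjuncts if (d.covered - d.errors) <= threshold]

# Fabricated numbers reflecting the pattern the paper describes:
# the large disjunct is accurate; the small ones have much higher
# (or unreliable) error rates despite covering few examples.
rules = [
    Disjunct("r1", covered=120, errors=3),   # large, ~2.5% error
    Disjunct("r2", covered=4, errors=2),     # small, 50% error
    Disjunct("r3", covered=3, errors=0),     # small, untrustworthy sample
]
small = small_disjuncts(rules, threshold=5)
```

Here `small` picks out `r2` and `r3`; the difficulty the paper studies is that simply deleting such disjuncts (or generalizing their neighbors to absorb their examples) tends to degrade the accuracy of the remaining disjuncts.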