Empirical Learning as a Function of Concept Character

  • Authors:
  • Larry Rendell (RENDELL@CS.UIUC.EDU); Howard Cho (HCHO@GONDOR.CS.PSU.EDU)

  • Affiliation:
  • Department of Computer Science, University of Illinois at Urbana-Champaign, 1304 W. Springfield Avenue, Urbana, Illinois 61801 U.S.A.

  • Venue:
  • Machine Learning
  • Year:
  • 1990

Abstract

Concept learning depends on the character of the data. To discover how, some researchers have used theoretical analysis to relate the behavior of idealized learning algorithms to classes of concepts. Others have developed pragmatic measures that relate the behavior of empirical systems such as ID3 and PLS1 to the kinds of concepts encountered in practice. But before learning behavior can be predicted, concepts and data must be characterized. Data characteristics include the number of instances, their error rate, concept “size,” and so forth. Although potential characteristics are numerous, they are constrained by the way one views concepts. Viewing concepts as functions over instance space leads to geometric characteristics such as concept size (the proportion of positive instances) and concentration (not too many “peaks”). Experiments show that some of these characteristics drastically affect the accuracy of concept learning. Data characteristics sometimes interact in non-intuitive ways; for example, noisy data may degrade accuracy differently depending on the size of the concept. Compared with the effects of some data characteristics, the choice of learning algorithm appears less important: accuracy degrades only slightly when the splitting criterion is replaced with random selection. Analyzing these observations suggests directions for concept learning research.
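The two geometric characteristics named above can be made concrete with a small sketch. This is an illustration, not code from the paper: it treats a concept as a Boolean function over a tiny discrete instance space, computes concept size as the proportion of positive instances, and approximates concentration by counting maximal runs of positive labels along one linear ordering of the space (fewer runs meaning a more concentrated concept). The function names and the example concept are hypothetical choices made for this sketch.

```python
from itertools import product

def concept_size(concept, instance_space):
    """Proportion of instances the concept labels positive."""
    instances = list(instance_space)
    positives = sum(1 for x in instances if concept(x))
    return positives / len(instances)

def peak_count(labels):
    """Number of maximal runs of positive labels in a linear ordering
    of the instance space; a rough one-dimensional analogue of the
    'peaks' the abstract mentions."""
    peaks, prev = 0, False
    for lab in labels:
        if lab and not prev:
            peaks += 1
        prev = lab
    return peaks

# Example: instances are 4-bit feature vectors; the concept is
# "at least three features are 1" (a hypothetical concept for illustration).
space = list(product([0, 1], repeat=4))
concept = lambda x: sum(x) >= 3

print(concept_size(concept, space))              # 5 of 16 instances -> 0.3125
print(peak_count([concept(x) for x in space]))   # runs of positives in lexicographic order
```

Under this toy view, a "large" concept has size near 0.5 or above, and a concentrated one forms few contiguous positive regions; the paper's experiments relate such characteristics to learning accuracy on far richer instance spaces.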