An Adjustable Description Quality Measure for Pattern Discovery Using the AQ Methodology

  • Authors:
  • Kenneth A. Kaufman; Ryszard S. Michalski

  • Affiliations:
  • Machine Learning and Inference Laboratory, George Mason University, Fairfax, VA 22030. kaufman@mli.gmu.edu; Machine Learning and Inference Laboratory, George Mason University, Fairfax, VA, and Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland. michalski@mli.gmu.edu

  • Venue:
  • Journal of Intelligent Information Systems - Special issue on methodologies for intelligent information systems
  • Year:
  • 2000

Abstract

In concept learning and data mining tasks, the learner is typically faced with a choice of many possible hypotheses or patterns characterizing the input data. If one can assume that training data contain no noise, then the primary conditions a hypothesis must satisfy are consistency and completeness with regard to the data. In real-world applications, however, data are often noisy, and the insistence on the full completeness and consistency of the hypothesis is no longer valid. In such situations, the problem is to determine a hypothesis that represents the best trade-off between completeness and consistency. This paper presents an approach to this problem in which a learner seeks rules optimizing a rule quality criterion that combines the rule coverage (a measure of completeness) and training accuracy (a measure of inconsistency). These factors are combined into a single rule quality measure through a lexicographical evaluation functional (LEF). The method has been implemented in the AQ18 learning system for natural induction and pattern discovery, and compared with several other methods. Experiments have shown that the proposed method can be easily tailored to different problems and can simulate different rule learners by modifying the parameter of the rule quality criterion.
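
The abstract does not give the formula for the rule quality criterion, so the following is only an illustrative sketch of how one adjustable parameter could trade coverage against training accuracy. It assumes a weighted product, `coverage ** w * accuracy ** (1 - w)`; that form, the function names `rule_quality` and `best_rule`, and the example counts are assumptions made for illustration, not the paper's definition or the AQ18 implementation.

```python
def rule_quality(p, n, P, w=0.5):
    """Score a rule from its coverage and training accuracy.

    p: positive training examples covered by the rule
    n: negative training examples covered by the rule
    P: total positive training examples
    w: weight in [0, 1]; higher values favor coverage (completeness),
       lower values favor training accuracy (consistency)

    The weighted-product form is an assumption for illustration only.
    """
    if P == 0 or p + n == 0:
        return 0.0
    coverage = p / P            # fraction of positives the rule covers
    accuracy = p / (p + n)      # fraction of covered examples that are positive
    return coverage ** w * accuracy ** (1 - w)


def best_rule(rules, P, w):
    """Return the name of the highest-scoring candidate rule."""
    return max(rules, key=lambda name: rule_quality(*rules[name], P, w))


# Hypothetical candidates: name -> (positives covered, negatives covered)
candidates = {
    "general": (90, 30),   # covers most positives, but also some negatives
    "specific": (40, 0),   # covers fewer positives, no negatives
}

for w in (0.9, 0.1):
    print(w, best_rule(candidates, P=100, w=w))
```

With these made-up counts, a weight near 1 selects the broader, less consistent rule, and a weight near 0 selects the narrower, fully consistent one. This mirrors the abstract's claim that changing the parameter lets the criterion behave like different rule learners.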