Detecting a Compact Decision Tree Based on an Appropriate Abstraction

  • Authors:
  • Yoshimitsu Kudoh; Makoto Haraguchi

  • Venue:
  • IDEAL '00 Proceedings of the Second International Conference on Intelligent Data Engineering and Automated Learning, Data Mining, Financial Engineering, and Intelligent Agents
  • Year:
  • 2000

Abstract

It is generally accepted that pre-processing is needed to exclude irrelevant and meaningless aspects of data before applying data mining algorithms. From this viewpoint, we have previously proposed the notion of Information Theoretical Abstraction and implemented a system, ITA. Given a relational database and a family of possible abstractions for its attribute values, called an abstraction hierarchy, ITA selects the best abstraction among the candidates so that the class distributions needed to perform the classification task are preserved as far as possible. In our previous experiments, a single application of abstraction to the whole database proved effective in reducing the size of the detected rules without worsening the classification error. However, because C4.5 performs attribute selection repeatedly, one step after another, ITA does not in general guarantee that class distributions are preserved across a sequence of attribute selections. For this reason, in this paper we propose a new version of ITA, called iterative ITA, which tries to preserve the class distributions as far as possible at each attribute-selection step.
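
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "preserving class distributions" can be approximated by the information gain of the abstracted attribute with respect to the class, and that each candidate abstraction is a plain mapping from concrete values to abstract values. The names entropy, information_gain, select_abstraction, and the weather data are all hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Reduction in class entropy obtained by splitting on the given values."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_abstraction(values, labels, candidates):
    """Return the name of the candidate abstraction that keeps the most
    information about the class (assumed scoring rule, see lead-in)."""
    def score(name):
        mapping = candidates[name]
        # Values without an entry in the mapping are left unchanged.
        abstracted = [mapping.get(v, v) for v in values]
        return information_gain(abstracted, labels)
    return max(candidates, key=score)


# Hypothetical toy data: one attribute column and its class labels.
weather = ["rain", "snow", "sunny", "cloudy"]
decision = ["stay", "stay", "go", "go"]

# Two hypothetical abstractions of the same attribute values.
candidates = {
    "by_temperature": {"rain": "mild", "sunny": "mild",
                       "snow": "cold", "cloudy": "cold"},
    "by_precipitation": {"rain": "wet", "snow": "wet",
                         "sunny": "dry", "cloudy": "dry"},
}

best = select_abstraction(weather, decision, candidates)
```

In this toy data, grouping by precipitation separates the two classes completely, while grouping by temperature mixes them, so the first abstraction loses no class information and the second loses all of it. An iterative variant in the spirit of the abstract would repeat this selection on the subset of records reaching each node of the tree rather than once for the whole database.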