An Appropriate Abstraction for an Attribute-Oriented Induction

  • Authors:
  • Yoshimitsu Kudoh; Makoto Haraguchi

  • Venue:
  • DS '99 Proceedings of the Second International Conference on Discovery Science
  • Year:
  • 1999

Abstract

Attribute-oriented induction is a useful data mining method that generalizes a database under an appropriate abstraction hierarchy in order to extract meaningful knowledge. The hierarchy is designed to exclude rules that are meaningless from a particular point of view. However, there may be several ways of generalizing a database, depending on the user's intention. It is therefore important to provide a multi-layered abstraction hierarchy under which several generalizations are possible and well controlled. Indeed, databases that are too general or too specific are unsuitable input for mining algorithms that extract significant rules. From this viewpoint, this paper proposes a generalization method that uses an information-theoretic measure to select an appropriate abstraction hierarchy. Furthermore, we present a system, called ITA (Information Theoretical Abstraction), based on our method and on attribute-oriented induction. We perform practical experiments in which ITA discovers meaningful rules from a US Census Bureau census database, and we discuss the validity of ITA based on the experimental results.
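
The idea of choosing among candidate abstractions with an information-theoretic score can be illustrated with a minimal sketch. The abstract does not state which measure ITA uses, so the sketch assumes information gain with respect to a target attribute; the function names, the toy records, and the two candidate mappings are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def information_gain(values, labels):
    """Reduction in label entropy obtained by grouping records on `values`."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def generalize(values, mapping):
    """Replace each attribute value by its abstract concept (one hierarchy step)."""
    return [mapping.get(v, v) for v in values]


def select_abstraction(values, labels, candidates):
    """Score every candidate abstraction and return the best name with all scores.

    Assumption: 'appropriate' is taken to mean 'highest information gain
    about the target attribute'; the paper's actual criterion may differ.
    """
    scores = {
        name: information_gain(generalize(values, mapping), labels)
        for name, mapping in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy records: an occupation attribute and a target income class.
occupations = ["teacher", "professor", "plumber", "electrician", "teacher", "plumber"]
income = ["high", "high", "low", "low", "high", "low"]

# Two hypothetical ways of abstracting the same attribute.
candidates = {
    "by_sector": {"teacher": "education", "professor": "education",
                  "plumber": "trade", "electrician": "trade"},
    "by_mixed": {"teacher": "group_a", "plumber": "group_a",
                 "professor": "group_b", "electrician": "group_b"},
}

best, scores = select_abstraction(occupations, income, candidates)
print(best, scores)
```

On this toy data the sector-based grouping separates the income classes completely, while the mixed grouping tells nothing about income, so the former is selected. A real system would also have to handle multiple attributes and several hierarchy levels, which this sketch leaves out.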