Constructing appropriate data abstractions for mining classification knowledge

  • Authors:
  • Yoshiaki Okubo;Yoshimitsu Kudoh;Makoto Haraguchi

  • Affiliations:
  • Division of Electronics and Information Engineering, Hokkaido University, Sapporo, Japan;Division of Electronics and Information Engineering, Hokkaido University, Sapporo, Japan;Division of Electronics and Information Engineering, Hokkaido University, Sapporo, Japan

  • Venue:
  • INAP'01: Proceedings of the 14th International Conference on Applications of Prolog: Web Knowledge Management and Decision Support
  • Year:
  • 2001

Abstract

The notion of data abstraction is very useful for discovering concise knowledge from large databases. For classification problems, we previously proposed criteria for selecting useful abstractions from a given set of candidates and developed a family of data abstraction systems called ITA, iterative ITA, and I2TA [5,6,7]. To make these systems more flexible, this paper attempts to construct useful abstractions from scratch. Because a data abstraction can be represented as a partition of the possible attribute values, the search space for construction generally contains a huge number of candidates. To reduce it, we introduce an ordering on abstractions and present a pruning method based on that ordering. Furthermore, we propose using the hierarchical structure among attribute values, extracted from a dictionary, to reject meaningless candidates: the search can be constrained by upper and lower bounds extracted from the dictionary. Preliminary experimental results show that the dictionary drastically reduces the number of candidates.
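The size of the search space and the effect of bounding it can be illustrated with a small sketch. It is not the authors' algorithm. It assumes the ordering on abstractions is the standard refinement order on partitions, which the abstract does not state, and the attribute values, the bounds, and the helper names `set_partitions`, `refines`, and `between` are all hypothetical.

```python
def set_partitions(values):
    """Yield every partition of `values` as a list of blocks (lists)."""
    values = list(values)
    if not values:
        yield []
        return
    first, rest = values[0], values[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ...or give it a block of its own.
        yield [[first]] + partition


def refines(fine, coarse):
    """True if every block of `fine` lies inside some block of `coarse`."""
    coarse_sets = [set(block) for block in coarse]
    return all(any(set(block) <= c for c in coarse_sets) for block in fine)


def between(candidate, lower, upper):
    """True if `candidate` is at least as coarse as `lower` and at most as coarse as `upper`."""
    return refines(lower, candidate) and refines(candidate, upper)


# Hypothetical attribute values and dictionary-derived bounds (assumptions).
values = ["apple", "pear", "beef", "pork", "milk"]
lower = [["apple", "pear"], ["beef", "pork"], ["milk"]]   # e.g. fruit / meat / dairy
upper = [["apple", "pear", "beef", "pork", "milk"]]       # e.g. food

all_candidates = list(set_partitions(values))
kept = [p for p in all_candidates if between(p, lower, upper)]

print(len(all_candidates))  # 52 partitions of five values (the Bell number B_5)
print(len(kept))            # 5 candidates survive the bounds
```

Even for five values, the unconstrained space has 52 candidates, and it grows as the Bell numbers. In this toy setting the bounds leave only the five ways of merging the three lower-bound blocks. A real system would prune during generation rather than enumerate and filter as this sketch does.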