Algorithms for clustering data
C4.5: programs for machine learning
Attribute-oriented induction in data mining
Advances in knowledge discovery and data mining
Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Some Criterions for Selecting the Best Data Abstractions
Progress in Discovery Science, Final Report of the Japanese Discovery Science Project
Cluster Analysis
Data Abstractions for Numerical Attributes in Data Mining
IDEAL '02 Proceedings of the Third International Conference on Intelligent Data Engineering and Automated Learning
The notion of data abstraction is very useful for discovering concise knowledge from large databases. For classification problems, we have previously proposed criteria for selecting useful abstractions from a set of given candidates and developed a family of data abstraction systems called ITA, iterative ITA, and I2TA [5,6,7]. To make these systems more flexible, this paper attempts to construct useful abstractions from scratch. Since a data abstraction can be represented as a partition of the possible attribute values, the search space for construction generally contains a huge number of candidates. To reduce the search space, we introduce an ordering on abstractions and present a pruning method based on that ordering. Furthermore, we propose using the hierarchical structure among attribute values, extracted from a dictionary, to reject meaningless candidates: the search is constrained by upper and lower bounds extracted from the dictionary. Preliminary experimental results show that the dictionary drastically reduces the number of candidates.
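As a rough illustration of why bounds shrink the search space, the following sketch treats an abstraction as a partition of attribute values and keeps only the partitions that lie between a lower and an upper bound. It assumes the ordering is partition refinement (one partition precedes another if each of its blocks is contained in a block of the other); the abstract does not define the ordering, so this is an assumption. The attribute values, the two bounds, and all function names are hypothetical, and no real dictionary is consulted. The paper's pruning method is not reproduced here: the sketch simply enumerates and filters.

```python
def set_partitions(items):
    """Yield every partition of `items` as a list of blocks (lists)."""
    items = list(items)
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partial in set_partitions(rest):
        # Place `first` into each existing block in turn ...
        for i in range(len(partial)):
            yield partial[:i] + [[first] + partial[i]] + partial[i + 1:]
        # ... or give it a block of its own.
        yield [[first]] + partial


def normalize(partition):
    """Represent a partition as a frozenset of frozensets (order-free)."""
    return frozenset(frozenset(block) for block in partition)


def refines(fine, coarse):
    """True if every block of `fine` lies inside some block of `coarse`."""
    return all(any(b <= c for c in coarse) for b in fine)


def candidates_between(values, lower, upper):
    """Brute-force filter: partitions p with lower <= p <= upper (refinement)."""
    return [
        p
        for p in map(normalize, set_partitions(values))
        if refines(lower, p) and refines(p, upper)
    ]


# Hypothetical attribute values and dictionary-derived bounds.
values = ["apple", "pear", "beef", "pork", "salmon"]
lower = normalize([["apple"], ["pear"], ["beef", "pork"], ["salmon"]])
upper = normalize([["apple", "pear"], ["beef", "pork", "salmon"]])

all_candidates = [normalize(p) for p in set_partitions(values)]
bounded = candidates_between(values, lower, upper)
print(len(all_candidates), "unconstrained candidates")
print(len(bounded), "candidates between the bounds")
```

With five values the unconstrained space is already the fifth Bell number, while the bounds leave only the partitions that merge lower-bound blocks without crossing an upper-bound block. A practical system would generate candidates inside the bounds directly rather than filter the full enumeration.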