Beam search induction and similarity constraints for predictive clustering trees

Authors:
Dragi Kocev;Jan Struyf;Sašo Džeroski
Affiliations:
Dept. of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia;Dept. of Computer Science, Katholieke Universiteit Leuven, Leuven, Belgium;Dept. of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia
Venue:
KDID'06 Proceedings of the 5th international conference on Knowledge discovery in inductive databases
Year:
2006

Abstract

Much research on inductive databases (IDBs) focuses on local models, such as item sets and association rules. In this work, we investigate how IDBs can support global models, such as decision trees. Our focus is on predictive clustering trees (PCTs). PCTs generalize decision trees and can be used for prediction and clustering, two of the most common data mining tasks. Regular PCT induction builds PCTs top-down, using a greedy algorithm similar to that of C4.5. We propose a new induction algorithm for PCTs based on beam search. This has three advantages over the regular method: (a) it returns a set of PCTs satisfying the user constraints instead of just one PCT; (b) it makes it easier to push user constraints into the induction algorithm; and (c) it is less susceptible to myopia. In addition, we propose similarity constraints for PCTs, which improve the diversity of the resulting PCT set.
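To make the search strategy in the abstract concrete, here is a minimal, generic beam search sketch with an optional similarity penalty. It is not the authors' algorithm: the names `beam_search`, `refine`, `score`, `similarity`, and `sim_weight` are hypothetical, the candidates are toy integers rather than PCTs, and the way the penalty enters the heuristic is an assumption made only for illustration.

```python
from typing import Callable, Hashable, Iterable, List, TypeVar

T = TypeVar("T", bound=Hashable)


def beam_search(
    start: T,
    refine: Callable[[T], Iterable[T]],    # one-step refinements of a candidate
    score: Callable[[T], float],           # heuristic value, lower is better
    similarity: Callable[[T, T], float],   # 1.0 means identical (assumed scale)
    beam_width: int = 3,
    sim_weight: float = 0.0,               # 0.0 disables the similarity penalty
    max_steps: int = 50,
) -> List[T]:
    """Keep the `beam_width` best candidates instead of a single greedy one."""
    beam: List[T] = [start]
    for _ in range(max_steps):
        # Pool = current beam plus all refinements, de-duplicated in order.
        refinements = [c for m in beam for c in refine(m)]
        pool = list(dict.fromkeys(beam + refinements))

        new_beam: List[T] = []
        while pool and len(new_beam) < beam_width:
            def penalised(c: T) -> float:
                # Penalise candidates that resemble those already selected.
                if not new_beam:
                    return score(c)
                avg = sum(similarity(c, b) for b in new_beam) / len(new_beam)
                return score(c) + sim_weight * avg

            best = min(pool, key=penalised)
            new_beam.append(best)
            pool.remove(best)

        if new_beam == beam:  # no change: the beam has converged
            break
        beam = new_beam
    return beam


# Toy problem (an assumption): walk along the integers toward the target 7.
def toy_refine(x: int) -> List[int]:
    return [x - 1, x + 1]


def toy_score(x: int) -> float:
    return abs(x - 7)


def toy_similarity(a: int, b: int) -> float:
    return 1.0 / (1.0 + abs(a - b))


plain = beam_search(0, toy_refine, toy_score, toy_similarity)
diverse = beam_search(0, toy_refine, toy_score, toy_similarity, sim_weight=10.0)
```

In this toy setting the first element of each returned beam is the best-scoring candidate, while a positive `sim_weight` discourages the remaining slots from being filled with near-duplicates of it. Returning the whole beam rather than one model is what corresponds to advantage (a) in the abstract.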