Classification is an important problem in data mining. Given a database of records, each with a class label, a classifier generates a concise and meaningful description for each class that can be used to classify subsequent records. A number of popular classifiers construct decision trees to generate class models. Frequently, however, the constructed trees are complex, with hundreds of nodes, and thus difficult to comprehend, a fact that calls into question an often-cited benefit of decision trees: that they are easy to interpret. In this paper, we address the problem of constructing “simple” decision trees with few nodes that are easy for humans to interpret. By permitting users to specify constraints on tree size or accuracy, and then building the “best” tree that satisfies the constraints, we ensure that the final tree is both easy to understand and accurate. We develop novel branch-and-bound algorithms that push the constraints into the building phase of classifiers, pruning early any tree nodes that cannot possibly satisfy the constraints. Our experimental results with real-life and synthetic data sets demonstrate that incorporating knowledge of the constraints into the building step, as opposed to applying the constraints after the entire tree is built, yields significant performance speedups and reductions in the number of nodes expanded.
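To make the branch-and-bound idea concrete, here is a minimal sketch (not the paper's actual algorithm, and the names `solve` and `misclassified` are illustrative) of constraint-driven tree search over binary features: the size constraint is a budget of internal nodes, and a branch is cut off as soon as the errors already committed make it impossible to beat the best tree found so far — the `cap` argument carries that remaining slack down the recursion.

```python
def misclassified(rows):
    # Errors if this node becomes a leaf predicting the majority class.
    ones = sum(label for _, label in rows)
    return min(ones, len(rows) - ones)

def solve(rows, n_features, budget, cap):
    """Minimum training errors of any tree over `rows` (list of
    (feature_tuple, 0/1 label)) using at most `budget` internal nodes.
    Branch-and-bound: once `cap` errors can no longer be beaten,
    the branch is pruned and the returned value is only a safe bound.
    """
    if cap <= 0:
        return 0  # caller's committed errors already reach its best; prune
    best = min(cap, misclassified(rows))  # candidate: stop splitting here
    if budget == 0 or best == 0:
        return best
    for f in range(n_features):
        left = [r for r in rows if r[0][f] == 0]
        right = [r for r in rows if r[0][f] == 1]
        if not left or not right:
            continue  # degenerate split, skip
        for lb in range(budget):  # internal-node budget given to the left child
            el = solve(left, n_features, lb, best)
            # Right child may only spend the slack the left child left over.
            er = solve(right, n_features, budget - 1 - lb, best - el)
            best = min(best, el + er)
            if best == 0:
                return 0
    return best

# XOR needs three splits: a single split (or none) cannot beat 2 errors,
# so those branches are bounded away quickly.
xor = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
print(solve(xor, 2, 1, 10))  # size constraint too tight: 2 errors remain
print(solve(xor, 2, 3, 10))  # budget of 3 internal nodes: perfect tree, 0 errors
```

The key design point, mirroring the paper's theme, is that the size constraint (`budget`) and the accuracy bound (`cap`) are enforced *during* construction, so subtrees that cannot satisfy them are never expanded, rather than building an unconstrained tree and pruning it afterwards.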