Instability of decision tree classification algorithms

Authors: Ruey-Hsia Li; Geneva G. Belford
Affiliations: Lightspeed Semiconductor, Sunnyvale, CA; University of Illinois at Urbana-Champaign, Urbana, IL
Venue: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Year: 2002

Abstract

The instability problem of decision tree classification algorithms is that small changes in the input training samples may cause dramatically large changes in the output classification rules. Different rules generated from almost the same training samples run counter to human intuition and complicate decision making. In this paper, we present fundamental theorems for the instability problem of decision tree classifiers. The first theorem gives the relationship between a data change and the resulting change in tree structure (i.e., a split change). The second theorem, the Instability Theorem, identifies the cause of the instability problem. Based on the two theorems, algorithmic improvements can be made to lessen the instability problem. Empirical results illustrate the theorem statements. The trees constructed by the proposed algorithm are more stable, noise-tolerant, informative, expressive, and concise. Our proposed sensitivity measure can be used as a metric to evaluate the stability of splitting predicates. The tree sensitivity is an indicator of the confidence level in rules and the effective lifetime of rules.
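
The instability described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm or their sensitivity measure: it assumes a Gini-impurity splitting criterion, two binary attributes, and a hypothetical nine-sample training set, and the function names are invented for illustration. It shows only that changing the label of a single training sample can change which attribute is selected for the root split.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(rows, attr):
    """Size-weighted Gini impurity after partitioning rows on attribute attr.

    Each row is a tuple (feature_0, feature_1, ..., label).
    """
    groups = {}
    for *features, label in rows:
        groups.setdefault(features[attr], []).append(label)
    n = len(rows)
    return sum(len(g) / n * gini(g) for g in groups.values())


def best_root_split(rows):
    """Index of the attribute with the lowest weighted Gini impurity."""
    n_attrs = len(rows[0]) - 1
    return min(range(n_attrs), key=lambda a: weighted_gini(rows, a))


# Hypothetical training set: (attribute 0, attribute 1, class label).
original = [(0, 0, 0)] * 3 + [(1, 1, 1)] * 3 + [(1, 0, 0), (0, 1, 0), (1, 0, 1)]

# Perturb the data by flipping the label of one sample out of nine.
perturbed = list(original)
perturbed[7] = (0, 1, 1)

print(best_root_split(original))   # 0: attribute 0 is chosen at the root
print(best_root_split(perturbed))  # 1: attribute 1 is chosen at the root
```

With the original data, splitting on attribute 0 gives a weighted impurity of about 0.18 against about 0.34 for attribute 1. After the single label flip the two values swap, so the root predicate, and with it every rule derived from the tree, changes.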