The Biases of Decision Tree Pruning Strategies

  • Authors:
  • Tapio Elomaa

  • Venue:
  • IDA '99 Proceedings of the Third International Symposium on Advances in Intelligent Data Analysis
  • Year:
  • 1999

Abstract

Post-pruning of decision trees has been a successful approach in many real-world experiments, but over all possible concepts it does not bring any inherent improvement to an algorithm's performance. This work explores how a PAC-proven decision tree learning algorithm fares in comparison with two variants of the standard top-down induction of decision trees. The algorithm does not prune its hypothesis per se, but it can be understood to perform pre-pruning of the evolving tree. We study a backtracking search algorithm, called Rank, for learning rank-minimal decision trees. Our experiments follow closely those performed by Schaffer [20] and confirm his main findings: pruning helps when learning concepts with a simple description, but for concepts with a complex description, and when all concepts are equally likely, pruning is injurious rather than beneficial to the average performance of greedy top-down induction of decision trees. Pre-pruning, as the gentler technique, attains an average performance that falls between not pruning at all and post-pruning.
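
For orientation, the rank that the Rank algorithm minimizes is commonly defined recursively (following Ehrenfeucht and Haussler): a leaf has rank 0, and an internal node whose subtrees have ranks r1 and r2 has rank max(r1, r2) if the ranks differ, and r1 + 1 if they are equal. The Python sketch below illustrates only this rank computation, not the paper's backtracking search itself; the Node class and example trees are hypothetical.

```python
# Minimal sketch of computing the rank of a binary decision tree,
# using the standard recursive definition; not the paper's Rank algorithm.

class Node:
    def __init__(self, left=None, right=None):
        self.left = left      # None for a leaf
        self.right = right    # None for a leaf

def rank(node):
    """Leaves have rank 0; an internal node has rank max(r_left, r_right)
    if the subtree ranks differ, and r_left + 1 if they are equal."""
    if node.left is None and node.right is None:
        return 0
    r_left, r_right = rank(node.left), rank(node.right)
    return max(r_left, r_right) if r_left != r_right else r_left + 1

# Example: a complete tree of depth 2 has rank 2, while a decision-list
# shaped tree with the same number of leaves has rank 1.
leaf = Node
complete = Node(Node(leaf(), leaf()), Node(leaf(), leaf()))
chain = Node(leaf(), Node(leaf(), Node(leaf(), leaf())))
print(rank(complete), rank(chain))   # -> 2 1
```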