A comparison of pruning criteria for probability trees

Authors:
Daan Fierens; Jan Ramon; Hendrik Blockeel; Maurice Bruynooghe
Affiliations:
Dept. of Computer Science, K.U. Leuven, Leuven, Belgium 3001; Dept. of Computer Science, K.U. Leuven, Leuven, Belgium 3001; Dept. of Computer Science, K.U. Leuven, Leuven, Belgium 3001; Dept. of Computer Science, K.U. Leuven, Leuven, Belgium 3001
Venue:
Machine Learning
Year:
2010

Abstract

Probability trees are decision trees that predict class probabilities rather than the most likely class. The pruning criterion used to learn a probability tree strongly influences the size of the tree and thereby also the quality of its probability estimates. While the effect of pruning criteria on classification accuracy is well-studied, only recently has there been more interest in the effect on probability estimates. Hence, it is currently unclear which pruning criteria for probability trees are preferable under which circumstances.

In this paper we survey six of the most important pruning criteria for probability trees, and discuss their theoretical advantages and disadvantages. We also perform an extensive experimental study of the relative performance of these pruning criteria. The main conclusion is that overall a pruning criterion based on randomization tests performs best because it is most robust to extreme data characteristics (such as class skew or a high number of classes). We also identify and explain several shortcomings of the other pruning criteria.
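
To make the idea of a randomization-test pruning criterion concrete, the following is a minimal Python sketch of one common way such a test can be set up. It is not taken from the paper and may differ from the procedure the authors evaluate. The assumptions are ours: information gain is used as the split heuristic, the split is binary, and the names `entropy`, `information_gain`, and `randomization_p_value` are hypothetical helpers introduced only for illustration. The sketch shuffles the class labels many times and asks how often a shuffled data set yields a gain at least as large as the observed one.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(labels, goes_left):
    """Entropy reduction from splitting `labels` by the boolean mask `goes_left`."""
    left = [y for y, g in zip(labels, goes_left) if g]
    right = [y for y, g in zip(labels, goes_left) if not g]
    if not left or not right:
        return 0.0  # a split that sends everything to one side gains nothing
    n = len(labels)
    return (entropy(labels)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))


def randomization_p_value(labels, goes_left, n_permutations=999, seed=0):
    """Estimate how often a label permutation scores at least as well as the real data.

    Assumption (ours): the split is fixed and only the labels are shuffled,
    which breaks any real association between the test and the class.
    """
    rng = random.Random(seed)
    observed = information_gain(labels, goes_left)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        # Small tolerance so floating-point noise does not hide exact ties.
        if information_gain(shuffled, goes_left) >= observed - 1e-12:
            at_least_as_good += 1
    # Add-one correction keeps the estimate strictly positive.
    return (1 + at_least_as_good) / (1 + n_permutations)
```

Under this sketch, a node would be split only when the estimated p-value falls below a chosen significance level (for example 0.05, again an assumption), and turned into a leaf otherwise. Because the reference distribution is built from the data at hand, the test makes no assumption about class balance or the number of classes, which is consistent with the robustness the abstract attributes to randomization-based pruning.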