Top-down induction of decision trees has been observed to suffer from the inadequate functioning of the pruning phase. In particular, it is known that the size of the resulting tree grows linearly with the sample size, even though the accuracy of the tree does not improve. Reduced Error Pruning is an algorithm that has been used as a representative technique in attempts to explain the problems of decision tree learning. In this paper we present analyses of Reduced Error Pruning in three different settings. First we study the basic algorithmic properties of the method, properties that hold independently of the input decision tree and the pruning examples. Then we examine a situation that intuitively should lead to the subtree under consideration being replaced by a leaf node: one in which the class labels and attribute values of the pruning examples are independent of each other. This analysis is conducted under two different assumptions. The general analysis shows that the probability of pruning a node that fits pure noise is bounded by a function that decreases exponentially as the size of the tree grows. In the specific analysis we assume that the examples are distributed uniformly over the tree. This assumption lets us approximate the number of subtrees that are pruned because they receive no pruning examples. This paper clarifies the different variants of the Reduced Error Pruning algorithm, brings new insight into its algorithmic properties, analyses the algorithm under fewer assumptions than before, and includes the previously overlooked empty subtrees in the analysis.
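The bottom-up variant of Reduced Error Pruning described above can be sketched as follows. This is a minimal illustration, not the paper's own code: the `Node` structure, helper names, and the convention that every node carries a default `label` (its training-set majority class) are assumptions made for the example.

```python
from dataclasses import dataclass, field
from collections import Counter

@dataclass
class Node:
    label: int                     # default prediction (training majority class)
    attribute: int = None          # attribute tested at an internal node
    children: dict = field(default_factory=dict)  # attribute value -> subtree

    def is_leaf(self):
        return not self.children

def classify(node, x):
    """Route example x down the tree and return the leaf's label."""
    while not node.is_leaf():
        node = node.children[x[node.attribute]]
    return node.label

def errors(node, examples):
    """Number of pruning examples (x, y) the subtree misclassifies."""
    return sum(1 for x, y in examples if classify(node, x) != y)

def reduced_error_prune(node, examples):
    """Bottom-up REP: first prune the children, then replace this subtree
    by a leaf whenever the leaf makes no more errors on the pruning
    examples reaching this node than the (already pruned) subtree does."""
    if node.is_leaf():
        return node
    # Partition the pruning examples among the children and recurse.
    for value, child in node.children.items():
        subset = [(x, y) for x, y in examples if x[node.attribute] == value]
        node.children[value] = reduced_error_prune(child, subset)
    # Candidate leaf: majority class of the pruning examples at this node;
    # an empty subtree (no pruning examples) falls back to the default label.
    if examples:
        majority = Counter(y for _, y in examples).most_common(1)[0][0]
    else:
        majority = node.label
    leaf = Node(label=majority)
    if errors(leaf, examples) <= errors(node, examples):
        return leaf
    return node
```

Note that when `examples` is empty both error counts are zero, so an empty subtree is always pruned to a leaf; this is exactly the empty-subtree case the paper's uniform-distribution analysis accounts for.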