Learning decision rules in noisy domains
Proceedings of Expert Systems '86, the 6th Annual Technical Conference on Research and Development in Expert Systems III
On estimating probabilities in tree pruning
EWSL-91: Proceedings of the European Working Session on Learning
C4.5: programs for machine learning
A Comparative Analysis of Methods for Pruning Decision Trees
IEEE Transactions on Pattern Analysis and Machine Intelligence
Self bounding learning algorithms
COLT' 98 Proceedings of the eleventh annual conference on Computational learning theory
International Journal of Human-Computer Studies - Special issue: 1969-1999, the 30th anniversary
Pessimistic decision tree pruning based on tree size
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
A Fast, Bottom-Up Decision Tree Pruning Algorithm with Near-Optimal Generalization
ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Generalization Bounds for Decision Trees
COLT '00 Proceedings of the Thirteenth Annual Conference on Computational Learning Theory
Experiments with an innovative tree pruning algorithm
AIAP'07 Proceedings of the 25th IASTED International Multi-Conference: Artificial Intelligence and Applications
Laplace's law of succession and universal encoding
IEEE Transactions on Information Theory
The decision tree classifier is a well-known methodology for classification. It is widely accepted that a fully grown tree usually over-fits the training data and should therefore be pruned back. In this paper, we analyze the over-fitting issue theoretically using a k-norm risk estimation approach with Lidstone's estimate. Our analysis allows a deeper understanding of decision tree classifiers, in particular of how their misclassification rates can be estimated using our equations. We propose a simple pruning algorithm based on this analysis and prove its desirable properties, including its independence from validation data and its efficiency.
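To make the ingredients of the abstract concrete, the sketch below shows Lidstone's estimate applied to the class counts at a tree node, together with an illustrative bottom-up prune test that compares the smoothed risk of a node collapsed to a leaf against the summed smoothed risk of its children. This is only a pessimistic-estimate-style illustration; the function names are hypothetical and the paper's actual k-norm risk equations are not reproduced here.

```python
def lidstone(counts, lam=1.0):
    """Lidstone's estimate: p(c) = (n_c + lam) / (N + lam * K),
    where n_c is the count of class c among the N training examples
    reaching the node and K is the number of classes.
    lam = 1 gives Laplace's rule of succession."""
    n = sum(counts)
    k = len(counts)
    return [(c + lam) / (n + lam * k) for c in counts]

def expected_errors(counts, lam=1.0):
    """Smoothed expected misclassifications if this node becomes a leaf.
    Even a pure leaf gets a nonzero estimate, which is what makes the
    estimate pessimistic."""
    n = sum(counts)
    return n * (1.0 - max(lidstone(counts, lam)))

def should_prune(parent_counts, child_counts, lam=1.0):
    """Illustrative bottom-up test (not the paper's k-norm criterion):
    prune when the parent-as-leaf risk does not exceed the combined
    risk of its children."""
    subtree_risk = sum(expected_errors(c, lam) for c in child_counts)
    return expected_errors(parent_counts, lam) <= subtree_risk

# A split that adds no purity only fragments the data, so smoothing
# penalizes it and the subtree is pruned:
print(should_prune([8, 0], [[4, 0], [4, 0]]))   # True
# A split that perfectly separates the classes is kept:
print(should_prune([4, 4], [[4, 0], [0, 4]]))   # False
```

Note that no validation set appears anywhere above: the prune decision is driven entirely by smoothed training-set counts, which mirrors the "independence from validation data" property claimed in the abstract.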