We propose a formulation of the decision tree learning algorithm in the sample compression setting and derive tight generalization error bounds. In particular, we propose sample compression and Occam's Razor bounds. We show that such bounds, unlike bounds based on the VC dimension or Rademacher complexities, are more general and can also perform a margin-sparsity trade-off to obtain better classifiers. Potentially, these risk bounds can also guide the model selection process and replace traditional pruning strategies.
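To make the kind of guarantee described above concrete, the Python sketch below evaluates the textbook consistent-case versions of the two bound families the abstract names: a Littlestone-Warmuth-style sample compression bound and a standard Occam's Razor bound. This is a minimal illustration under simplifying assumptions (a consistent classifier, a fixed compression-set size, a prefix-free bit encoding of the tree); the bounds derived in the paper itself are tighter and also handle training errors and the margin-sparsity trade-off. The helper names (log_binom, compression_bound, occam_bound) are illustrative, not from the paper.

import math

def log_binom(n, k):
    # Natural log of the binomial coefficient C(n, k), via lgamma
    # for numerical stability at large n.
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def compression_bound(m, d, delta=0.05):
    # Classic sample compression bound for a consistent scheme:
    # with probability >= 1 - delta over an i.i.d. sample of size m,
    # a classifier reconstructed from a compression set of d examples
    # that is consistent with the remaining m - d examples has true
    # risk at most (ln C(m, d) + ln(1/delta)) / (m - d).
    return (log_binom(m, d) + math.log(1.0 / delta)) / (m - d)

def occam_bound(m, description_bits, delta=0.05):
    # Standard Occam's Razor bound: a consistent hypothesis that can
    # be described in description_bits bits has true risk at most
    # (description_bits * ln 2 + ln(1/delta)) / m.
    return (description_bits * math.log(2.0) + math.log(1.0 / delta)) / m

if __name__ == "__main__":
    # Hypothetical model-selection comparison: a tree reconstructible
    # from 50 of 10,000 training examples versus a tree encoded in
    # 400 bits; the smaller bound identifies the preferred model.
    print(compression_bound(m=10_000, d=50))             # approx. 0.032
    print(occam_bound(m=10_000, description_bits=400))   # approx. 0.028

Used this way, the bounds serve the role the abstract suggests for them: computing a risk certificate for each candidate subtree and keeping the one with the smallest bound, in place of a heuristic pruning rule.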