Occam's razor is the principle that, given two hypotheses consistent with the observed data, the simpler one should be preferred. Many machine learning algorithms follow this principle and search for a small hypothesis within the version space. The principle has been the subject of a heated debate, with theoretical and empirical arguments both for and against it. Earlier empirical studies lacked sufficient coverage to resolve the debate. In this work we provide convincing empirical evidence for Occam's razor in the context of decision tree induction. By applying a variety of sophisticated sampling techniques, our methodology samples the version space for many real-world domains and tests the correlation between the size of a tree and its accuracy. We show that a smaller tree is indeed likely to be more accurate, and that this correlation is statistically significant across most domains.
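The abstract's methodology can be sketched in miniature: draw many decision trees from the hypothesis space, record each tree's size and held-out accuracy, and test whether the two are negatively correlated. The sketch below is an illustration only, not the paper's actual sampling procedure; it uses scikit-learn's randomized splitter as a crude stand-in sampler and a hypothetical benchmark dataset (`load_breast_cancer`) in place of the many real-world domains the paper covers.

```python
# Illustrative sketch (assumption: randomized trees as a stand-in for the
# paper's version-space sampling techniques, which are not described here).
from scipy.stats import spearmanr
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

sizes, accs = [], []
for seed in range(200):
    # splitter="random" picks split thresholds at random, so each seed
    # yields a different tree consistent with the training data.
    tree = DecisionTreeClassifier(splitter="random", random_state=seed)
    tree.fit(X_tr, y_tr)
    sizes.append(tree.tree_.node_count)   # tree size (total node count)
    accs.append(tree.score(X_te, y_te))   # held-out accuracy

# A negative rank correlation would mean smaller trees tend to be
# more accurate, in line with Occam's razor; p gauges significance.
rho, p = spearmanr(sizes, accs)
print(f"Spearman rho = {rho:.3f}, p = {p:.3g}")
```

In a full study one would repeat this per domain and aggregate the per-domain significance tests, as the abstract describes.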