A widely persisting interpretation of Occam's razor is that, given two classifiers with the same training error, the simpler classifier is more likely to generalize better. In a long-running debate over Occam's razor in the machine learning community, Domingos (Data Min. Knowl. Discov. 3:409-425, 1999) rejects this interpretation and proposes that model complexity is only a confounding factor, usually correlated with the number of models from which the learner selects. The hypothesis is thus that the risk of overfitting (poor generalization) follows only from the number of model tests, not from the complexity of the selected model. We test this hypothesis on 30 UCI data sets using polynomial classification models. The results confirm Domingos' hypothesis at the 0.05 significance level and thus refute the above interpretation of Occam's razor. Our experiments also illustrate, however, that decoupling the two factors (model complexity and number of model tests) is problematic.
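The hypothesized mechanism, overfitting driven by the number of model tests rather than by model complexity, can be illustrated with a toy simulation. This is a hedged sketch, not the paper's actual experiment (which used polynomial models on UCI data): here every candidate model has identical complexity (a single random threshold), labels are pure noise, and only the number of candidates tested varies.

```python
import random

random.seed(0)

def make_random_model():
    # A random threshold classifier on a scalar feature, with random
    # orientation. All such models have the same complexity.
    t = random.uniform(0, 1)
    sign = random.choice([0, 1])
    return lambda x: sign if x < t else 1 - sign

def evaluate(model, data):
    # Accuracy of the model on a list of (feature, label) pairs.
    return sum(1 for x, y in data if model(x) == y) / len(data)

def best_of(n_models, train):
    # Test n_models random candidates and keep the one with the lowest
    # training error (highest training accuracy).
    candidates = [make_random_model() for _ in range(n_models)]
    return max(candidates, key=lambda m: evaluate(m, train))

# Labels are independent coin flips, so no model can truly do better
# than 50% on unseen data.
train = [(random.random(), random.randint(0, 1)) for _ in range(100)]
test = [(random.random(), random.randint(0, 1)) for _ in range(10000)]

for n in (1, 10, 1000):
    m = best_of(n, train)
    print(n, round(evaluate(m, train), 2), round(evaluate(m, test), 2))
```

As the number of tested models grows, the training accuracy of the selected model climbs while its test accuracy stays near chance, so the train/test gap widens even though model complexity never changes. This is the multiple-testing effect that, under Domingos' hypothesis, Occam's razor mistakes for a complexity effect.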