The principle of parsimony, also known as "Ockham's razor", has inspired many theories of model selection. Yet, although all such theories argue in favor of parsimony, they rest on very different premises and have developed distinct methodologies to derive algorithms. We have organized challenges and edited a special issue of JMLR and several conference proceedings around the theme of model selection. In this editorial, we revisit the problem of avoiding overfitting in light of the latest results. We note the remarkable convergence, in some approaches, of theories as different as Bayesian theory, Minimum Description Length, the bias/variance tradeoff, Structural Risk Minimization, and regularization. We also present new and interesting examples of the complementarity of theories leading to hybrid algorithms that are neither frequentist nor Bayesian, or perhaps both frequentist and Bayesian!
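As a minimal sketch of the kind of model selection discussed above (not taken from the editorial itself — the data, penalty grid, and function names are illustrative assumptions), the snippet below selects the ridge penalty for a one-parameter regression by validation error. The shrinkage penalty is where parsimony enters: larger penalties favor simpler (smaller-weight) models, and the validation split plays the frequentist role of estimating generalization error.

```python
# Illustrative sketch: frequentist model selection for ridge regression.
# All data values and names here are assumptions for the example.

def ridge_fit(xs, ys, lam):
    # Closed-form ridge solution for the model y ~ w * x (no intercept):
    # w = sum(x*y) / (sum(x^2) + lam)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def val_error(w, xs, ys):
    # Mean squared error of the fitted weight on held-out data.
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Tiny synthetic split: noisy observations around a linear trend.
train_x, train_y = [1.0, 2.0, 3.0], [2.1, 3.9, 6.2]
val_x, val_y = [1.5, 2.5], [2.8, 4.6]

# Scan a grid of penalties and keep the one with lowest validation error.
best = min(
    [0.0, 0.1, 1.0, 10.0],
    key=lambda lam: val_error(ridge_fit(train_x, train_y, lam), val_x, val_y),
)
print(best)  # a nonzero penalty wins here: shrinkage reduces validation error
```

A Bayesian counterpart would instead place a Gaussian prior on the weight (whose precision plays the role of the penalty) and select it by maximizing the marginal likelihood, illustrating the convergence of the two views mentioned in the abstract.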