Robust Classifiers by Mixed Adaptation
IEEE Transactions on Pattern Analysis and Machine Intelligence
We present a general approach to model selection and regularization that exploits unlabeled data to adaptively control hypothesis complexity in supervised learning tasks. The idea is to impose a metric structure on hypotheses by determining the discrepancy between their predictions across the distribution of unlabeled data. We show how this metric can be used to detect untrustworthy training error estimates, and devise novel model selection strategies that exhibit theoretical guarantees against over-fitting (while still avoiding under-fitting). We then extend the approach to derive a general training criterion for supervised learning—yielding an adaptive regularization method that uses unlabeled data to automatically set regularization parameters. This new criterion adjusts its regularization level to the specific set of training data received, and performs well on a variety of regression and conditional density estimation tasks. The only proviso for these methods is that sufficient unlabeled training data be available.
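As a concrete illustration of the first idea, the sketch below is a minimal, hedged rendering of metric-based model selection, not the paper's own experimental code. It fits a nested family of polynomial regressors on a small labeled sample, measures the discrepancy between hypotheses on a large unlabeled sample, and applies a triangle-inequality test in the spirit of the metric-based selection strategies the abstract describes. The synthetic data, the L1 loss, the polynomial hypothesis class, and helper names such as d_unlab are all illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

# Labeled training data (scarce) and unlabeled data (plentiful) -- synthetic.
n_lab, n_unlab = 30, 2000
x_lab = rng.uniform(-1.0, 1.0, n_lab)
y_lab = np.sin(np.pi * x_lab) + rng.normal(0.0, 0.2, n_lab)
x_unlab = rng.uniform(-1.0, 1.0, n_unlab)

# A nested sequence of hypotheses of increasing complexity.
degrees = list(range(13))
hyps = [np.poly1d(np.polyfit(x_lab, y_lab, d)) for d in degrees]

# Empirical distance from each hypothesis to the target, measured only on the
# labeled sample; this is the estimate that becomes untrustworthy as
# complexity grows.
err_lab = [float(np.mean(np.abs(h(x_lab) - y_lab))) for h in hyps]

def d_unlab(f, g):
    # Metric between two hypotheses: mean absolute disagreement of their
    # predictions, estimated over the unlabeled sample.
    return float(np.mean(np.abs(f(x_unlab) - g(x_unlab))))

# Triangle-inequality test: keep the most complex hypothesis whose distance
# (on unlabeled data) to every simpler hypothesis still satisfies
#     d(f_j, f_k) <= err(f_j) + err(f_k),
# where the right-hand side uses the training-error estimates. A violation
# flags the complex model's training error as untrustworthy (over-fitting).
selected = 0
for k in range(1, len(hyps)):
    if all(d_unlab(hyps[j], hyps[k]) <= err_lab[j] + err_lab[k]
           for j in range(k)):
        selected = k

print("selected polynomial degree:", degrees[selected])

The adaptive regularization extension mentioned in the abstract follows the same pattern: rather than picking one hypothesis from a discrete sequence, the same unlabeled-sample metric would be used to rescale a regularization penalty for the particular training set received; the sketch above covers only the model selection half.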