Robust Classifiers by Mixed Adaptation
IEEE Transactions on Pattern Analysis and Machine Intelligence
We present a general approach to model selection and regularization that exploits unlabeled data to adaptively control hypothesis complexity in supervised learning tasks. The idea is to impose a metric structure on hypotheses by determining the discrepancy between their predictions across the distribution of unlabeled data. We show how this metric can be used to detect untrustworthy training error estimates, and devise novel model selection strategies that exhibit theoretical guarantees against over-fitting (while still avoiding under-fitting). We then extend the approach to derive a general training criterion for supervised learning—yielding an adaptive regularization method that uses unlabeled data to automatically set regularization parameters. This new criterion adjusts its regularization level to the specific set of training data received, and performs well on a variety of regression and conditional density estimation tasks. The only proviso for these methods is that sufficient unlabeled training data be available.
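As a concrete illustration of the first idea, the sketch below is a minimal, hedged rendering of metric-based model selection, not the paper's own experimental code. It fits a nested family of polynomial regressors on a small labeled sample, measures the discrepancy between hypotheses on a large unlabeled sample, and applies a triangle-inequality test in the spirit of the metric-based selection strategies the abstract describes. The synthetic data, the L1 loss, the polynomial hypothesis class, and helper names such as d_unlab are all illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

# Labeled training data (scarce) and unlabeled data (plentiful) -- synthetic.
n_lab, n_unlab = 30, 2000
x_lab = rng.uniform(-1.0, 1.0, n_lab)
y_lab = np.sin(np.pi * x_lab) + rng.normal(0.0, 0.2, n_lab)
x_unlab = rng.uniform(-1.0, 1.0, n_unlab)

# A nested sequence of hypotheses of increasing complexity.
degrees = list(range(13))
hyps = [np.poly1d(np.polyfit(x_lab, y_lab, d)) for d in degrees]

# Empirical distance from each hypothesis to the target, measured only on the
# labeled sample; this is the estimate that becomes untrustworthy as
# complexity grows.
err_lab = [float(np.mean(np.abs(h(x_lab) - y_lab))) for h in hyps]

def d_unlab(f, g):
    # Metric between two hypotheses: mean absolute disagreement of their
    # predictions, estimated over the unlabeled sample.
    return float(np.mean(np.abs(f(x_unlab) - g(x_unlab))))

# Triangle-inequality test: keep the most complex hypothesis whose distance
# (on unlabeled data) to every simpler hypothesis still satisfies
#     d(f_j, f_k) <= err(f_j) + err(f_k),
# where the right-hand side uses the training-error estimates. A violation
# flags the complex model's training error as untrustworthy (over-fitting).
selected = 0
for k in range(1, len(hyps)):
    if all(d_unlab(hyps[j], hyps[k]) <= err_lab[j] + err_lab[k]
           for j in range(k)):
        selected = k

print("selected polynomial degree:", degrees[selected])

The adaptive regularization extension mentioned in the abstract follows the same pattern: rather than picking one hypothesis from a discrete sequence, the same unlabeled-sample metric would be used to rescale a regularization penalty for the particular training set received; the sketch above covers only the model selection half.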