We present two new methods for obtaining generalization error bounds in a semi-supervised setting. Both methods are based on approximating the disagreement probability of pairs of classifiers using unlabeled data. The first method works in the realizable case. It suggests how the ERM principle can be refined using unlabeled data and has provable optimality guarantees when the number of unlabeled examples is large. Furthermore, the technique extends easily to cover active learning. A downside is that the method is of little use in practice due to its limitation to the realizable case. The idea in our second method is to use unlabeled data to transform bounds for randomized classifiers into bounds for simpler deterministic classifiers. As a concrete example of how the general method works in practice, we apply it to a bound based on cross-validation. The result is a semi-supervised bound for classifiers learned from all the labeled data. The bound is easy to implement and apply and should be tight whenever cross-validation makes sense. Applying the bound to SVMs on the MNIST benchmark data set suggests that the bound may be tight enough to be useful in practice.
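The quantity both methods rely on, the disagreement probability of a pair of classifiers, can be estimated from unlabeled data alone by a simple empirical mean. The sketch below is a minimal illustration of that estimate, not the paper's implementation; the two threshold classifiers and the uniform data distribution are hypothetical choices made for the example.

```python
import numpy as np

def disagreement(h1, h2, X_unlabeled):
    """Empirical estimate of P(h1(x) != h2(x)) from unlabeled examples.

    No labels are needed: we only check where the two classifiers differ.
    """
    return float(np.mean(h1(X_unlabeled) != h2(X_unlabeled)))

# Hypothetical classifiers: threshold rules on a 1-D feature.
h1 = lambda X: (X[:, 0] > 0.4).astype(int)
h2 = lambda X: (X[:, 0] > 0.6).astype(int)

# Unlabeled sample drawn uniformly from [0, 1]; the two rules disagree
# exactly on the interval (0.4, 0.6], which has probability mass 0.2.
rng = np.random.default_rng(0)
X_unlabeled = rng.uniform(0.0, 1.0, size=(100_000, 1))

d = disagreement(h1, h2, X_unlabeled)
# d is close to 0.2; the estimate concentrates as the unlabeled sample grows.
```

Because the estimate is an average of i.i.d. Bernoulli indicators, standard concentration bounds control its error, which is what lets a large pool of cheap unlabeled data sharpen the resulting generalization bounds.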