We consider bounds on the prediction error of classification algorithms based on sample compression. We refine the notion of a compression scheme to distinguish permutation- and repetition-invariant compression schemes from non-invariant ones, leading to different prediction error bounds. We also extend known compression results to the case of non-zero empirical risk.

We provide bounds on the prediction error of classifiers returned by mistake-driven online learning algorithms by interpreting mistake bounds as bounds on the size of the algorithm's compression scheme. This yields a bound on the prediction error of perceptron solutions that depends on the margin a support vector machine would achieve on the same training sample.

Furthermore, using the compression property we derive bounds on the average prediction error of kernel classifiers in the PAC-Bayesian framework. These bounds assume a prior measure over the expansion coefficients in the data-dependent kernel expansion and bound the average prediction error uniformly over subsets of the space of expansion coefficients.
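To illustrate the flavor of such results, the following is a minimal sketch of evaluating a classic Littlestone–Warmuth-style sample compression bound for the zero-empirical-risk case. It is not the refined bound of this work; the function name and the concrete constants are illustrative, and the bound shown is the standard form in which the error of a classifier reconstructed from a compression set of size d, consistent with the remaining m - d examples, is controlled by ln C(m, d) plus the confidence term:

```python
import math


def compression_bound(m, d, delta=0.05):
    """Littlestone-Warmuth-style compression bound (zero empirical risk).

    With probability at least 1 - delta over an i.i.d. sample of size m,
    a classifier reconstructed from a compression set of d examples that
    is consistent with the remaining m - d examples has prediction error
    at most (ln C(m, d) + ln(1/delta)) / (m - d).
    """
    assert 0 <= d < m, "compression set must be a strict subset of the sample"
    # ln C(m, d) computed via log-gamma to avoid huge intermediate integers
    log_binom = math.lgamma(m + 1) - math.lgamma(d + 1) - math.lgamma(m - d + 1)
    return (log_binom + math.log(1.0 / delta)) / (m - d)


# Example: a mistake-driven learner (e.g. the perceptron) that makes 20
# mistakes on 1000 training examples induces a compression set of size
# at most 20, so its mistake bound plugs directly into the bound.
print(compression_bound(1000, 20))
```

This connects the abstract's observation that mistake bounds double as compression-set sizes: fewer mistakes mean a smaller d and hence a tighter bound.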