PAC-Bayesian Compression Bounds on the Prediction Error of Learning Algorithms for Classification

  • Authors:
  • Thore Graepel;Ralf Herbrich;John Shawe-Taylor

  • Affiliations:
  • Microsoft Research Cambridge, UK;Microsoft Research Cambridge, UK;School of Electronics and Computer Science, University of Southampton, UK

  • Venue:
  • Machine Learning
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

We consider bounds on the prediction error of classification algorithms based on sample compression. We refine the notion of a compression scheme to distinguish permutation and repetition invariant and non-permutation and repetition invariant compression schemes leading to different prediction error bounds. Also, we extend known results on compression to the case of non-zero empirical risk.We provide bounds on the prediction error of classifiers returned by mistake-driven online learning algorithms by interpreting mistake bounds as bounds on the size of the respective compression scheme of the algorithm. This leads to a bound on the prediction error of perceptron solutions that depends on the margin a support vector machine would achieve on the same training sample.Furthermore, using the property of compression we derive bounds on the average prediction error of kernel classifiers in the PAC-Bayesian framework. These bounds assume a prior measure over the expansion coefficients in the data-dependent kernel expansion and bound the average prediction error uniformly over subsets of the space of expansion coefficients.