Bayesian Methods for Support Vector Machines: Evidence and Predictive Class Probabilities

  • Authors: Peter Sollich
  • Affiliations: Department of Mathematics, King's College London, Strand, London WC2R 2LS, UK. peter.sollich@kcl.ac.uk
  • Venue: Machine Learning
  • Year: 2002

Abstract

I describe a framework for interpreting Support Vector Machines (SVMs) as maximum a posteriori (MAP) solutions to inference problems with Gaussian Process priors. This probabilistic interpretation can provide intuitive guidelines for choosing a ‘good’ SVM kernel. Beyond this, it allows Bayesian methods to be used for tackling two of the outstanding challenges in SVM classification: how to tune hyperparameters (the misclassification penalty C, and any parameters specifying the kernel), and how to obtain predictive class probabilities rather than the conventional deterministic class label predictions. Hyperparameters can be set by maximizing the evidence; I explain how the latter can be defined and properly normalized. Both analytical approximations and numerical methods (Monte Carlo chaining) for estimating the evidence are discussed. I also compare different methods of estimating class probabilities, ranging from simple evaluation at the MAP or at the posterior average to full averaging over the posterior. A simple toy application illustrates the various concepts and techniques.
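To make the correspondence concrete: with a Gaussian Process prior of covariance (kernel) K over a latent function θ(x), and a per-example likelihood proportional to exp(−C ℓ(y θ(x))) with ℓ(z) = max(0, 1 − z) the hinge loss, the MAP solution minimizes ½ θᵀK⁻¹θ + C Σᵢ ℓ(yᵢ θ(xᵢ)), which is the standard SVM objective written in latent-function form. The sketch below illustrates the "full averaging over the posterior" route to predictive class probabilities on a toy one-dimensional problem. It is my own minimal illustration, not code from the paper: the RBF kernel, the toy data, the random-walk Metropolis sampler, and the normalized per-point likelihood in `class_prob` are all assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_kernel(A, B, lengthscale=1.0):
    # Squared-exponential (RBF) kernel matrix between row-vector sets A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def hinge(z):
    # SVM hinge loss l(z) = max(0, 1 - z).
    return np.maximum(0.0, 1.0 - z)

def class_prob(theta, C):
    # Normalized per-point likelihood P(y = +1 | theta); an assumption of this
    # sketch (hinge-loss likelihood normalized over the two class labels).
    a, b = np.exp(-C * hinge(theta)), np.exp(-C * hinge(-theta))
    return a / (a + b)

# Toy 1-D data: labels follow sign(x) with some noise.
X = np.linspace(-2, 2, 20)[:, None]
y = np.sign(X[:, 0] + 0.3 * rng.standard_normal(20))
x_star = np.array([[0.5]])  # test input

# Joint GP prior over latent values at training + test inputs.
Z = np.vstack([X, x_star])
K = rbf_kernel(Z, Z) + 1e-8 * np.eye(len(Z))  # jitter for stability
L = np.linalg.cholesky(K)
K_inv = np.linalg.inv(K)

C = 2.0  # misclassification penalty

def neg_log_post(theta):
    # GP prior term + hinge-loss likelihood on the training points only;
    # its minimizer is the MAP / SVM solution in latent-function form.
    return 0.5 * theta @ K_inv @ theta + C * hinge(y * theta[:-1]).sum()

# Plain random-walk Metropolis over the joint latent vector, with the
# proposal preconditioned by the prior Cholesky factor.
theta = np.zeros(len(Z))
cur = neg_log_post(theta)
probs = []
for step in range(20000):
    prop = theta + 0.05 * (L @ rng.standard_normal(len(Z)))
    new = neg_log_post(prop)
    if np.log(rng.random()) < cur - new:
        theta, cur = prop, new
    if step >= 5000:  # discard burn-in
        probs.append(class_prob(theta[-1], C))

# Posterior-averaged predictive class probability at the test input,
# as opposed to a single evaluation at the MAP.
print("posterior-averaged P(y=+1 | x*):", np.mean(probs))
```

Sampling jointly over the training and test latents keeps the sketch short, since it avoids deriving the conditional predictive distribution explicitly; the test-point probability is then just the average of `class_prob` along the chain.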