A Tutorial on Support Vector Machines for Pattern Recognition

  • Authors: Christopher J. C. Burges
  • Affiliations: Bell Laboratories, Lucent Technologies. E-mail: burges@lucent.com
  • Venue: Data Mining and Knowledge Discovery
  • Year: 1998

Abstract

The tutorial starts with an overview of the concepts of VC dimension and structural risk minimization. We then describe linear Support Vector Machines (SVMs) for separable and non-separable data, working through a non-trivial example in detail. We describe a mechanical analogy, and discuss when SVM solutions are unique and when they are global. We describe how support vector training can be practically implemented, and discuss in detail the kernel mapping technique which is used to construct SVM solutions which are nonlinear in the data. We show how Support Vector Machines can have very large (even infinite) VC dimension by computing the VC dimension for homogeneous polynomial and Gaussian radial basis function kernels. While very high VC dimension would normally bode ill for generalization performance, and while at present there exists no theory which shows that good generalization performance is guaranteed for SVMs, there are several arguments which support the observed high accuracy of SVMs, which we review. Results of some experiments which were inspired by these arguments are also presented. We give numerous examples and proofs of most of the key theorems. There is new material, and I hope that the reader will find that even old material is cast in a fresh light.
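For orientation, the kernel construction the abstract refers to can be summarized by the standard kernelized SVM decision function; the sketch below uses conventional notation (dual coefficients α_i, labels y_i ∈ {−1, +1}, bias b, Gaussian RBF kernel width σ) rather than reproducing the tutorial's own derivation:

```latex
% Kernelized SVM decision function: the sign of a weighted sum of
% kernel evaluations against the N_s support vectors x_i.
f(\mathbf{x}) = \operatorname{sgn}\!\left( \sum_{i=1}^{N_s} \alpha_i\, y_i\, K(\mathbf{x}_i, \mathbf{x}) + b \right),
\qquad
K(\mathbf{x}, \mathbf{x}') = \exp\!\left( -\frac{\lVert \mathbf{x} - \mathbf{x}' \rVert^2}{2\sigma^2} \right)
```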