Minimal kernel classifiers

  • Authors:
  • Glenn M. Fung, Olvi L. Mangasarian, Alexander J. Smola

  • Affiliations:
  • Glenn M. Fung and Olvi L. Mangasarian: Computer Sciences Department, University of Wisconsin, 1210 West Dayton Street, Madison, WI
  • Alexander J. Smola: Australian National University, Department of Engineering and RSISE, Canberra, ACT 0200, Australia

  • Venue:
  • The Journal of Machine Learning Research
  • Year:
  • 2003

Abstract

A finite concave minimization algorithm is proposed for constructing kernel classifiers that use a minimal number of data points both in generating and in characterizing a classifier. The algorithm is justified theoretically, on the basis of linear programming perturbation theory and a leave-one-out error bound, and computationally, by effective results on seven real-world datasets. A nonlinear rectangular kernel is generated by systematically using as few data points as possible both in training and in characterizing a nonlinear separating surface. This can substantially reduce kernel data-dependence (by over 94% on six of the seven public datasets tested) while achieving test set correctness equal to that of a conventional support vector machine classifier that depends on many more data points. This reduced data dependence yields a much faster classifier that requires less storage. To eliminate data points, the proposed approach uses a novel loss function, the "pound" function (·)#, a linear combination of the 1-norm and the step function that measures both the magnitude and the presence of any error.
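The abstract's "pound" loss combines an error's magnitude (its 1-norm) with its presence (a step function). The sketch below, in Python with NumPy, shows one plausible form of such a loss; the mixing weight `lam` and the exact combination are illustrative assumptions, not the paper's precise definition.

```python
import numpy as np

def pound_loss(xi, lam=0.5):
    """Pound-style loss on a slack vector xi: a linear combination of
    the 1-norm (how large the errors are) and the step function
    (how many errors there are). `lam` is a hypothetical mixing
    weight, not taken from the paper."""
    xi_plus = np.maximum(xi, 0.0)          # only constraint violations count
    magnitude = np.sum(xi_plus)            # 1-norm term: size of the errors
    presence = np.count_nonzero(xi_plus)   # step-function term: number of errors
    return lam * magnitude + (1.0 - lam) * presence

# Example: two of four slack values violate their constraints.
slacks = np.array([-0.2, 0.0, 0.5, 1.3])
print(pound_loss(slacks))  # 0.5 * 1.8 + 0.5 * 2 = 1.9
```

Because the step term penalizes every nonzero error equally regardless of size, minimizing such a loss tends to drive many slacks exactly to zero, which is what allows the algorithm to discard the corresponding data points.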