Fast semi-supervised SVM classifiers using a priori metric information

  • Authors:
  • Volkan Vural;Glenn Fung;Jennifer G. Dy;Bharat Rao

  • Affiliations:
  • Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA;Computer Aided Diagnosis and Therapy, Siemens Medical Solutions, Malvern, PA, USA;Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA;Computer Aided Diagnosis and Therapy, Siemens Medical Solutions, Malvern, PA, USA

  • Venue:
  • Optimization Methods & Software - Mathematical programming in data mining and machine learning
  • Year:
  • 2008

Abstract

This paper describes a support vector machine (SVM)-based parametric optimization method for semi-supervised classification, called LIAM (for linear hyperplane classifier with a priori metric information). Our method uses similarity information to exploit the unlabelled data when training SVMs. In addition to the smoothness constraints used in existing semi-supervised methods, LIAM incorporates local class similarity constraints, which we show empirically improve accuracy when only a few labelled points are available. We present and discuss a general convex mathematical-programming-based formulation for the inductive semi-supervised problem; that is, the proposed algorithm directly classifies test samples that were not present during training. This general formulation yields different variants depending on the norms used in the objective function. For example, with the 1-norm the proposed formulation becomes a linear programming problem, which has the advantage of generating sparse solutions that depend on a minimal subset of the original features (feature selection). On the other hand, one of the proposed formulations results in an unconstrained quadratic problem whose solution can be obtained by solving a simple system of linear equations, giving a fast, competitive alternative to state-of-the-art semi-supervised algorithms. Our experiments on public benchmarks indicate that LIAM is at least one order of magnitude faster than other state-of-the-art semi-supervised classification methods and, in most cases, at least as accurate.
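To illustrate why an unconstrained quadratic objective reduces to a single linear solve, the following is a minimal sketch and not the LIAM formulation itself, which the abstract does not spell out. It assumes a least-squares loss on the labelled points, a 2-norm penalty on the hyperplane parameters, and a graph-Laplacian smoothness term built from an RBF similarity matrix. The function names, the parameters `lam`, `mu`, and `sigma`, and the toy data are all hypothetical choices made for this example.

```python
import numpy as np

def rbf_similarity(X, sigma=1.0):
    """Pairwise RBF similarities with a zero diagonal (an assumed metric)."""
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    S = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(S, 0.0)
    return S

def fit_linear_semi_supervised(X, y, labelled, S, lam=1e-2, mu=1.0):
    """Minimise over z = [w; b]:
         sum_{i labelled} (a_i z - y_i)^2 + lam * ||z||^2 + mu * (A z)^T L (A z)
    where A = [X, 1] and L is the graph Laplacian of S.
    Setting the gradient to zero gives one linear system H z = rhs."""
    n, d = X.shape
    A = np.hstack([X, np.ones((n, 1))])      # append a bias column
    L = np.diag(S.sum(axis=1)) - S           # unnormalised graph Laplacian
    mask = labelled.astype(float)            # 1 for labelled rows, 0 otherwise
    H = A.T @ (mask[:, None] * A) + lam * np.eye(d + 1) + mu * (A.T @ L @ A)
    rhs = A.T @ (mask * y)
    z = np.linalg.solve(H, rhs)              # H is positive definite for lam > 0
    return z[:-1], z[-1]

def predict(X, w, b):
    """Inductive prediction: works on samples never seen during training."""
    return np.sign(X @ w + b)

# Toy data (hypothetical): two well-separated Gaussian blobs, one label each.
rng = np.random.default_rng(0)
X_pos = rng.normal(loc=[3.0, 0.0], scale=0.5, size=(40, 2))
X_neg = rng.normal(loc=[-3.0, 0.0], scale=0.5, size=(40, 2))
X_train = np.vstack([X_pos, X_neg])
y_train = np.zeros(80)                       # unlabelled entries stay 0
labelled = np.zeros(80, dtype=bool)
labelled[0], labelled[40] = True, True
y_train[0], y_train[40] = 1.0, -1.0

w, b = fit_linear_semi_supervised(X_train, y_train, labelled,
                                  rbf_similarity(X_train))

# Fresh test samples drawn from the same two blobs.
X_test = np.vstack([rng.normal([3.0, 0.0], 0.5, size=(50, 2)),
                    rng.normal([-3.0, 0.0], 0.5, size=(50, 2))])
y_test = np.concatenate([np.ones(50), -np.ones(50)])
acc = float(np.mean(predict(X_test, w, b) == y_test))
```

Because the learned model is an explicit hyperplane `(w, b)`, new samples are classified directly without re-solving the problem, which is the inductive property the abstract refers to. The cost is dominated by forming and solving a system whose size is the number of features plus one.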