Fast semi-supervised SVM classifiers using a priori metric information

Authors:
Volkan Vural;Glenn Fung;Jennifer G. Dy;Bharat Rao
Affiliations:
Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA;Computer Aided Diagnosis and Therapy, Siemens Medical Solutions, Malvern, PA, USA;Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA;Computer Aided Diagnosis and Therapy, Siemens Medical Solutions, Malvern, PA, USA
Venue:
Optimization Methods & Software - Mathematical programming in data mining and machine learning
Year:
2008

Abstract

This paper describes a support vector machine (SVM)-based parametric optimization method for semi-supervised classification, called LIAM (for linear hyperplane classifier with a priori metric information). Our method uses similarity information to exploit the unlabelled data when training SVMs. In addition to the smoothness constraints used in existing semi-supervised methods, LIAM incorporates local class similarity constraints which, as we show empirically, improve accuracy when only a few labelled points are available. We present and discuss a general convex mathematical-programming-based formulation for the inductive semi-supervised problem; i.e. our proposed algorithm directly classifies test samples that were not present during training. This general formulation yields different variants depending on the choice of norms used in the objective function. For example, with the 1-norm the proposed formulation becomes a linear programming problem, which has the advantage of generating sparse solutions that depend on a minimal set of the original features (feature selection). On the other hand, one of the proposed formulations results in an unconstrained quadratic problem whose solution can be obtained by solving a simple system of linear equations, giving a fast, competitive alternative to state-of-the-art semi-supervised algorithms. Our experiments on public benchmarks indicate that LIAM is at least one order of magnitude faster than other state-of-the-art semi-supervised classification methods and, in most cases, at least as accurate.
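
To illustrate how an unconstrained quadratic formulation reduces to a single linear system, the following is a minimal sketch and not the LIAM formulation itself, which the abstract does not spell out. It assumes a proximal-SVM-style squared loss on the labelled points plus a graph-Laplacian smoothness penalty built from a Gaussian similarity over all points. The function names and the parameters `nu`, `mu` and `sigma` are hypothetical choices made for this example.

```python
import numpy as np


def gaussian_similarity(X, sigma=1.0):
    """Dense Gaussian similarity matrix with a zero diagonal (assumed metric)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.maximum(d2, 0.0, out=d2)  # clip tiny negatives caused by round-off
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W


def fit_linear_semi_supervised(X_lab, y_lab, X_unlab, nu=1.0, mu=0.1, sigma=1.0):
    """Fit a linear classifier f(x) = x @ w + b by solving one linear system.

    Assumed objective (illustrative only), with z = [w; b] and A = [X, 1]:
        0.5 * ||z||^2
        + 0.5 * nu * ||A_lab z - y||^2     (squared loss on labelled points)
        + 0.5 * mu * z^T A^T L A z         (smoothness over the similarity graph)
    Setting the gradient to zero gives
        (I + nu * A_lab^T A_lab + mu * A^T L A) z = nu * A_lab^T y.
    """
    X_all = np.vstack([X_lab, X_unlab])
    A_all = np.hstack([X_all, np.ones((X_all.shape[0], 1))])
    A_lab = A_all[: X_lab.shape[0]]

    W = gaussian_similarity(X_all, sigma)
    L = np.diag(W.sum(axis=1)) - W  # unnormalised graph Laplacian

    n = A_all.shape[1]
    H = np.eye(n) + nu * A_lab.T @ A_lab + mu * A_all.T @ L @ A_all
    rhs = nu * A_lab.T @ y_lab
    z = np.linalg.solve(H, rhs)  # H is symmetric positive definite
    return z[:-1], z[-1]


def predict(X, w, b):
    """Inductive prediction: labels in {-1, +1} for samples unseen in training."""
    return np.where(X @ w + b >= 0, 1, -1)
```

Because the fitted model is an explicit hyperplane `(w, b)`, `predict` can be applied to samples that were not available during training, which is the inductive setting the abstract describes. The system solved here has one row per feature plus one for the bias, so its size does not grow with the number of unlabelled points. The dense similarity matrix does grow with them, and in this sketch that would be the first bottleneck on larger data.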