Arbitrary norm support vector machines

Authors:
Kaizhu Huang;Danian Zheng;Irwin King;Michael R. Lyu
Venue:
Neural Computation
Year:
2009

Abstract

Support vector machines (SVMs) are state-of-the-art classifiers. Typically, the L2-norm or L1-norm is adopted as the regularization term in SVMs, while SVMs based on other norms, for example, the L0-norm SVM or even the L∞-norm SVM, are rarely seen in the literature. The major reason is that the L0-norm describes a discontinuous and nonconvex term, leading to a combinatorially NP-hard optimization problem. In this letter, motivated by Bayesian learning, we propose a novel framework that can implement arbitrary norm-based SVMs in polynomial time. One significant feature of this framework is that only a sequence of sequential minimal optimization problems needs to be solved, which makes it practical in many real applications. The proposed framework is important in the sense that Bayesian priors can be efficiently plugged into most learning methods without knowing their explicit form. Hence, it builds a connection between Bayesian learning and kernel machines. We derive the theoretical framework, demonstrate how our approach works on the L0-norm SVM as a typical example, and perform a series of experiments to validate its advantages. Experimental results on nine benchmark data sets are very encouraging. The implemented L0-norm SVM is competitive with or even better than the standard L2-norm SVM in terms of accuracy, but with a reduced number of support vectors (9.46% of the number on average). When compared with another sparse model, the relevance vector machine, our proposed algorithm also demonstrates better sparsity, with a training speed over seven times faster.
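To make the distinction between the norms named in the abstract concrete, the following minimal sketch computes the L0, L1, L2, and L∞ values of a coefficient vector. It is only an illustration of the standard definitions, not the framework proposed in the letter; the function name lp_norm and the example vectors are hypothetical.

```python
def lp_norm(w, p):
    """Return the L_p value of vector w for p = 0, p = inf, or any p > 0.

    Hypothetical helper for illustration only; it follows the standard
    textbook definitions and is not taken from the letter.
    """
    if p == 0:
        # L0 counts nonzero entries. It is not a true norm and jumps from
        # 0 to 1 as soon as an entry leaves zero (discontinuous, nonconvex).
        return sum(1 for x in w if x != 0)
    if p == float("inf"):
        # L-infinity is the largest absolute entry.
        return max(abs(x) for x in w)
    return sum(abs(x) ** p for x in w) ** (1.0 / p)


# Two example vectors with the same L2 value but different sparsity.
w_dense = [0.5, -0.5, 0.5, -0.5]
w_sparse = [1.0, 0.0, 0.0, 0.0]

for name, w in [("dense", w_dense), ("sparse", w_sparse)]:
    values = [lp_norm(w, p) for p in (0, 1, 2, float("inf"))]
    print(name, values)

# An arbitrarily small perturbation changes L0 by a full unit.
print(lp_norm([0.0, 0.0], 0), lp_norm([1e-9, 0.0], 0))
```

Both example vectors have the same L2 value, so an L2 penalty does not distinguish them, while the L0 value directly counts how many coefficients are active. The last line shows the jump in L0 under a tiny perturbation, which is the discontinuity the abstract refers to.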