The standard support vector machine (SVM) minimizes the hinge loss function subject to the L2 penalty, also known as the roughness penalty. Recently, the L1 SVM was suggested for variable selection because it produces sparse solutions [Bradley, P., Mangasarian, O., 1998. Feature selection via concave minimization and support vector machines. In: Shavlik, J. (Ed.), ICML '98. Morgan Kaufmann, Los Altos, CA; Zhu, J., Hastie, T., Rosset, S., Tibshirani, R., 2003. 1-norm support vector machines. Neural Information Processing Systems 16]. These learning methods are non-adaptive since their penalty forms are pre-determined before looking at the data, and each tends to perform well only in certain situations. For instance, the L2 SVM generally works well except when there are too many noise inputs, whereas the L1 SVM is preferred in the presence of many noise variables. In this article we propose and explore an adaptive learning procedure called the Lq SVM, where the best q > 0 is automatically chosen by the data. Both two-class and multi-class classification problems are considered. We show that the new adaptive approach combines the benefits of a class of non-adaptive procedures and gives the best performance of this class across a variety of situations. Moreover, we observe that the proposed Lq penalty is more robust to noise variables than the L1 and L2 penalties. An iterative algorithm is suggested to solve the Lq SVM efficiently. Simulations and real data applications support the effectiveness of the proposed procedure.
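
The abstract does not spell out the iterative algorithm, so the following Python sketch is illustrative only: it fits an Lq-penalized SVM by plain subgradient descent on the hinge loss plus a smoothed Lq penalty, and chooses q from the data by cross-validation over a small grid. The function names lq_svm_subgradient and choose_q_by_cv, the step-size schedule, the smoothing constant eps, and the CV-based selection of q are all assumptions made for this sketch, not the authors' method; note also that for q < 1 the penalty is non-convex, so descent finds only a local solution.

import numpy as np

def lq_svm_subgradient(X, y, q=1.0, lam=0.1, lr=0.01, n_iter=2000, eps=1e-8):
    """Minimize avg hinge loss + lam * sum_j |w_j|^q by subgradient descent.
    Labels y are in {-1, +1}; the intercept b is left unpenalized."""
    n, p = X.shape
    w = np.zeros(p)
    b = 0.0
    for t in range(n_iter):
        margins = y * (X @ w + b)
        active = margins < 1                       # points violating the margin
        # Subgradient of the average hinge loss
        grad_w = -(X[active].T @ y[active]) / n
        grad_b = -y[active].sum() / n
        # (Sub)gradient of the Lq penalty; eps guards |w_j| = 0 when q < 1
        grad_w += lam * q * np.sign(w) * (np.abs(w) + eps) ** (q - 1)
        step = lr / np.sqrt(t + 1)                 # diminishing step size
        w -= step * grad_w
        b -= step * grad_b
    return w, b

def choose_q_by_cv(X, y, q_grid=(0.5, 1.0, 1.5, 2.0), lam=0.1, n_folds=5, seed=0):
    """Pick q from q_grid by k-fold cross-validated accuracy (illustrative)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    best_q, best_acc = None, -np.inf
    for q in q_grid:
        accs = []
        for k in range(n_folds):
            test = folds[k]
            train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
            w, b = lq_svm_subgradient(X[train], y[train], q=q, lam=lam)
            accs.append((np.sign(X[test] @ w + b) == y[test]).mean())
        if np.mean(accs) > best_acc:
            best_q, best_acc = q, np.mean(accs)
    return best_q, best_acc

# Example on hypothetical synthetic data with many noise features:
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))                     # only the first 2 features matter
y = np.sign(X[:, 0] + X[:, 1] + 0.1 * rng.normal(size=200))
q_hat, cv_acc = choose_q_by_cv(X, y)
w, b = lq_svm_subgradient(X, y, q=q_hat)

In this toy setup, smaller values of q tend to drive the weights of the eight noise features toward zero, which mirrors the abstract's observation that the data-chosen Lq penalty can be more robust to noise variables than a fixed L1 or L2 penalty.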