SVM based feature selection: why are we using the dual?

Authors:
Guillermo L. Grinblat;Javier Izetta;Pablo M. Granitto
Affiliations:
French Argentine International Center for Information and Systems Sciences, France and UNR-CONICET, Rosario, Argentina;French Argentine International Center for Information and Systems Sciences, France and UNR-CONICET, Rosario, Argentina;French Argentine International Center for Information and Systems Sciences, France and UNR-CONICET, Rosario, Argentina
Venue:
IBERAMIA'10 Proceedings of the 12th Ibero-American conference on Advances in artificial intelligence
Year:
2010

References:
A training algorithm for optimal margin classifiers. COLT '92 Proceedings of the fifth annual workshop on Computational learning theory.
Wrappers for feature subset selection. Artificial Intelligence - Special issue on relevance.
An introduction to Support Vector Machines: and other kernel-based learning methods.
Gene Selection for Cancer Classification using Support Vector Machines. Machine Learning.
An introduction to variable and feature selection. The Journal of Machine Learning Research.
Variable selection using SVM based criteria. The Journal of Machine Learning Research.
Use of the zero norm with linear models and kernel methods. The Journal of Machine Learning Research.
Kernel Methods for Pattern Analysis.
A Modified Finite Newton Method for Fast Solution of Large Scale Linear SVMs. The Journal of Machine Learning Research.
Training a Support Vector Machine in the Primal. Neural Computation.
MSVM-RFE. Bioinformatics.

Cited by:
Feature words that classify problem sentence in scientific article. Proceedings of the 14th International Conference on Information Integration and Web-based Applications & Services.

Abstract

Most Support Vector Machine (SVM) implementations solve the dual optimization problem. Feature selection algorithms based on SVMs are no different; in particular, the most widely used method in the area, Guyon et al.'s Recursive Feature Elimination (SVM-RFE), is also based on the dual problem. However, the dual is just one of the available routes to a solution of the original SVM optimization problem. In this work we discuss some potential problems that arise when ranking features with the dual-based version of SVM-RFE and propose a primal-based version of this well-known method, PSVM-RFE. We show that the new method detects relevant features better, particularly in situations involving non-linear decision boundaries. Using several artificial and real-world datasets, we compare both versions of SVM-RFE and find that PSVM-RFE is preferable in most situations.
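
To make the idea of recursive feature elimination concrete, the following is a minimal sketch of the ranking loop for a linear SVM trained directly in the primal. It is an illustration under stated assumptions and not the authors' PSVM-RFE, which the abstract describes as targeting non-linear decision boundaries. The trainer (full-batch subgradient descent on the L2-regularized hinge loss), the function names `train_linear_svm_primal` and `svm_rfe_ranking`, the hyperparameter values, and the toy data are all hypothetical choices made for this example. The elimination criterion, the squared weight of each feature, is the one commonly used for linear SVM-RFE.

```python
def train_linear_svm_primal(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimize 0.5*lam*||w||^2 + mean(hinge loss) by subgradient descent.

    Assumption: a deliberately simple primal solver for illustration only.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularizer
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # point violates the margin: hinge term is active
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_rfe_ranking(X, y):
    """Return feature indices ordered from most to least relevant."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while remaining:
        # Retrain on the surviving features only.
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm_primal(X_sub, y)
        # Drop the feature with the smallest squared weight.
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return eliminated[::-1]  # last eliminated = most relevant


# Hypothetical toy data: feature 0 is strongly informative,
# feature 2 weakly informative, feature 1 carries no class signal.
X_toy = [[2.0, 0.1, 0.5], [1.5, -0.1, 0.5], [-2.0, 0.1, -0.5], [-1.5, -0.1, -0.5]]
y_toy = [1, 1, -1, -1]
ranking = svm_rfe_ranking(X_toy, y_toy)
print(ranking)
```

On this toy data the uninformative feature is removed first and the strongly informative one survives longest. A dual-based variant would follow the same loop but obtain the ranking criterion from the dual solution instead of from an explicitly optimized weight vector.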