Feature selection is an important aspect of solving data-mining and machine-learning problems. This paper proposes a feature-selection method for Support Vector Machine (SVM) learning. Like most feature-selection methods, the proposed method ranks all features in decreasing order of importance so that the more relevant features can be identified. It uses a novel criterion based on the probabilistic outputs of the SVM. This criterion, termed Feature-based Sensitivity of Posterior Probabilities (FSPP), evaluates the importance of a specific feature by computing the aggregate value, over the feature space, of the absolute difference between the probabilistic outputs of the SVM with and without that feature. Because the exact form of this criterion is not easily computable, approximation is needed; four approximations, FSPP1–FSPP4, are proposed for this purpose. The first two approximations evaluate the criterion by randomly permuting the values of the feature among the samples of the training data. They differ in their choice of mapping function from the standard SVM output to its probabilistic output: FSPP1 uses a simple threshold function, while FSPP2 uses a sigmoid function. The last two approximate the criterion directly but differ in their smoothness assumptions on the criterion with respect to the features. The performance of these approximations, used in an overall feature-selection scheme, is then evaluated on various artificial and real-world problems, including datasets from the recent Neural Information Processing Systems (NIPS) feature-selection competition. FSPP1–FSPP3 show consistently good performance, with FSPP2 being the best overall by a slight margin. On the datasets tested, FSPP2 is competitive with some of the best-performing feature-selection methods in the literature. Its computational cost is modest, making it well suited as a feature-selection method for SVM applications.
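The permutation-based variants can be sketched as follows. This is a minimal illustration of the general idea, not the authors' exact FSPP1/FSPP2 implementation: it assumes scikit-learn for the SVM, and it uses a fixed sigmoid on the SVM decision value as a stand-in for a properly fitted probabilistic mapping (the paper's FSPP2-style choice).

```python
# Sketch of a permutation-sensitivity feature ranking for an SVM classifier.
# Assumptions (not from the paper): scikit-learn's SVC, an RBF kernel with
# default hyperparameters, and a fixed sigmoid in place of a fitted Platt map.
import numpy as np
from sklearn.svm import SVC

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fspp_rank(X, y, seed=None):
    """Rank features by the mean absolute change in the SVM's
    (approximate) probabilistic output when each feature is permuted."""
    rng = np.random.default_rng(seed)
    clf = SVC(kernel="rbf").fit(X, y)
    p = sigmoid(clf.decision_function(X))  # probabilistic output on intact data
    scores = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])  # destroy feature j's information
        p_perm = sigmoid(clf.decision_function(Xp))
        scores[j] = np.mean(np.abs(p - p_perm))  # aggregate sensitivity to feature j
    return np.argsort(scores)[::-1]  # feature indices, most important first
```

Permuting a feature's values across training samples preserves its marginal distribution while severing its relationship with the labels, so features whose permutation barely moves the probabilistic output are candidates for removal.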