Recently, the support vector machine (SVM) has been receiving increasing attention in the field of regression estimation due to its remarkable characteristics, such as good generalization performance, the absence of local minima, and a sparse representation of the solution. However, within the SVM framework, there are very few established approaches for identifying important features. Selecting significant features from all candidate features is the first step in regression estimation, and this procedure can improve network performance, reduce network complexity, and speed up training. This paper investigates the use of saliency analysis (SA) and the genetic algorithm (GA) in SVMs for selecting important features in the context of regression estimation. SA measures the importance of a feature by evaluating the sensitivity of the network output with respect to the feature input. The derivation of this sensitivity in terms of the partial derivative of the SVM output with respect to the feature input is presented, and a systematic approach to removing irrelevant features based on the sensitivity is developed. The GA is an efficient search method based on the mechanics of natural selection and population genetics. A simple GA is used in which all features are mapped onto binary chromosomes, with a bit "1" representing the inclusion of a feature and a bit "0" representing its absence. The performance of SA and the GA is tested on two simulated non-linear time series and five real financial time series. The experiments show that, on the simulated data, the GA and SA detect the same true feature set from the redundant feature set, and SA is also insensitive to the choice of kernel function. On the real financial data, the GA and SA select different subsets of features. Both selected feature sets achieve higher generalization performance in SVMs than the full feature set.
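As a rough illustration of the saliency measure (a sketch under stated assumptions, not the paper's derivation): for an SVM regressor with an RBF kernel, the decision function is f(x) = Σ_i α_i K(x_i, x) + b, so the sensitivity of the output to feature j follows by differentiating the kernel: ∂f/∂x_j = −2γ Σ_i α_i K(x_i, x)(x_j − x_{i,j}). The support vectors and dual coefficients below are set by hand for illustration rather than obtained by training:

```python
import numpy as np

def rbf_svm_output(x, sv, alpha, b, gamma):
    """Decision function f(x) = sum_i alpha_i * K(sv_i, x) + b with an RBF kernel."""
    k = np.exp(-gamma * np.sum((sv - x) ** 2, axis=1))  # K(sv_i, x) for each SV
    return alpha @ k + b

def saliency(x, sv, alpha, gamma):
    """Partial derivatives df/dx_j = -2*gamma * sum_i alpha_i * K(sv_i, x) * (x_j - sv_ij)."""
    k = np.exp(-gamma * np.sum((sv - x) ** 2, axis=1))
    return -2.0 * gamma * (alpha * k) @ (x - sv)

# Hypothetical "fitted" model: 5 support vectors in 3 dimensions.
rng = np.random.default_rng(0)
sv = rng.normal(size=(5, 3))
alpha = rng.normal(size=5)
x = rng.normal(size=3)
grad = saliency(x, sv, alpha, gamma=0.5)
```

Averaging |∂f/∂x_j| over a sample of inputs gives a per-feature saliency score; features whose score stays near zero contribute little to the output and are candidates for removal.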
In addition, the generalization performance of the feature sets selected by the GA and by SA is similar. All the results demonstrate that both SA and the GA are effective in SVMs for identifying important features.
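The binary-chromosome GA described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: to keep it self-contained, an ordinary least-squares fit stands in for the SVM as the fitness evaluator (validation mean squared error of a model trained on the selected features), and all GA parameters (population size, generations, mutation rate) are arbitrary choices:

```python
import numpy as np

def fitness(mask, Xtr, ytr, Xval, yval):
    """Score a chromosome: validation MSE of a least-squares fit on the selected features."""
    if not mask.any():
        return np.inf  # empty feature set is invalid
    w, *_ = np.linalg.lstsq(Xtr[:, mask], ytr, rcond=None)
    return float(np.mean((Xval[:, mask] @ w - yval) ** 2))

def ga_select(Xtr, ytr, Xval, yval, pop=20, gens=30, pmut=0.05, seed=0):
    """Binary-chromosome GA: bit 1 includes a feature, bit 0 excludes it."""
    rng = np.random.default_rng(seed)
    d = Xtr.shape[1]
    P = rng.integers(0, 2, size=(pop, d)).astype(bool)  # random initial population
    best, best_f = P[0].copy(), np.inf
    for _ in range(gens):
        f = np.array([fitness(c, Xtr, ytr, Xval, yval) for c in P])
        i = int(np.argmin(f))
        if f[i] < best_f:
            best, best_f = P[i].copy(), f[i]
        # binary tournament selection (lower validation MSE wins)
        a, b = rng.integers(0, pop, size=(2, pop))
        parents = P[np.where(f[a] < f[b], a, b)]
        # one-point crossover on consecutive pairs of parents
        kids = parents.copy()
        for j in range(0, pop - 1, 2):
            c = rng.integers(1, d)
            kids[j, c:], kids[j + 1, c:] = parents[j + 1, c:].copy(), parents[j, c:].copy()
        # bit-flip mutation, then elitism: carry the best chromosome forward
        P = kids ^ (rng.random((pop, d)) < pmut)
        P[0] = best
    return best
```

For example, on data where the target depends only on features 0 and 2 out of 8 candidates, the returned mask should include those two bits while the purely noisy features gain nothing on the validation set. In the paper's setting the fitness evaluator would instead be the trained SVM's validation error, which is what makes GA-based selection expensive relative to SA.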