Feature selection is often found to be an essential pre-processing step when data mining is applied to many-attribute datasets (e.g., several hundred or thousands of attributes). Feature selection aims to pre-select a relatively small number of attributes, thus speeding up further processing and (hopefully) eliminating data that have minimal or no discriminatory power. Often, feature selection is done on the basis of straightforward statistical correlation, discarding the features that correlate most weakly with the target class(es). However, when these correlation values are low for all features (as is common in many important datasets), the basis for pre-selecting any specific set of features is undermined, and straightforward feature selection may do more harm than good. We confirm this by investigating the performance of five feature selection strategies on several datasets with varying overall correlation values, finding that statistical correlation is never the best choice for poorly correlated data. The most reliable methods among those tested are either no feature selection or Evolutionary Algorithm feature selection.
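For illustration, the correlation-based baseline criticised above can be sketched in a few lines of Python. This is not the paper's code; the function name `correlation_rank`, the synthetic data, and the choice of Pearson correlation as the "straightforward statistical correlation" are assumptions made only to show how such a ranking filter typically works: score each attribute by its absolute correlation with the (numerically encoded) class and keep the top k.

```python
import numpy as np

def correlation_rank(X, y, k):
    """Hypothetical correlation-filter sketch (not the paper's method).

    Ranks features by absolute Pearson correlation with the target y
    and returns the column indices of the k highest-scoring features.

    X : (n_samples, n_features) array of attribute values
    y : (n_samples,) array of numerically encoded class labels
    k : number of features to keep
    """
    Xc = X - X.mean(axis=0)                       # center each attribute
    yc = y - y.mean()                             # center the target
    denom = np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum())
    denom[denom == 0] = np.inf                    # constant features score 0
    corr = (Xc * yc[:, None]).sum(axis=0) / denom # per-feature Pearson r
    return np.argsort(-np.abs(corr))[:k]          # indices of top-k features

# Toy usage on synthetic, weakly informative data (purely illustrative)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 500))
y = (X[:, 3] + 0.5 * rng.normal(size=200) > 0).astype(float)
selected = correlation_rank(X, y, k=10)
print(selected)
```

When, as the abstract argues, all of these per-feature correlations are uniformly low, the ordering produced by `argsort` is dominated by noise, so the "top k" set carries little justification over any other subset of the same size.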