The influence of search mechanisms in feature subset selection processes

Authors:
Maria do Carmo Nicoletti;Daniel M. Santoro
Affiliations:
Computer Science Department, Universidade Federal de S. Carlos, S. Carlos, SP, Brazil; Computer Science Department, Universidade Federal de S. Carlos, S. Carlos, SP, Brazil
Venue:
Intelligent Decision Technologies
Year:
2008

Abstract

The features that describe the training instances are crucial to the success of a machine learning (ML) algorithm. A training set described by redundant or irrelevant features, for instance, can mislead the ML algorithm into learning a poor expression of the real concept embedded in the data. Feature subset selection (FSS) processes aim to identify and remove as much irrelevant and redundant information as possible. They generally conduct a heuristic search in the space defined by all possible subsets of the initial feature set, trying to identify the subset most relevant to the learning task. This paper describes an empirical investigation of the influence of the search mechanism on identifying a suitable feature subset. It comparatively evaluates five search methods, namely Hill-Climbing, Beam-Search, Random-Bit-Climber, Las Vegas search and Genetic Algorithm, combined with different strategies for initiating the process. Experiments were conducted using eleven knowledge domains and eleven different combinations of search method and strategy. Results are presented and comparatively analyzed.
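
To make the idea of searching the space of feature subsets concrete, the following is a minimal Python sketch of one of the five methods named above, Hill-Climbing, combined with three possible starting strategies. It is an illustration under stated assumptions, not the authors' implementation: the function names, the strategy labels ("empty", "full", "random"), the one-feature-flip neighbourhood and the toy scoring function are all hypothetical. In a wrapper-style setting, the score would typically be an estimate of classifier accuracy on the candidate subset.

```python
import random
from typing import Callable, FrozenSet, Iterator, Tuple

Subset = FrozenSet[int]  # a feature subset, stored as a set of feature indices


def initial_subset(strategy: str, n_features: int, rng: random.Random) -> Subset:
    """Build the starting point of the search (strategy labels are assumptions)."""
    if strategy == "empty":
        return frozenset()
    if strategy == "full":
        return frozenset(range(n_features))
    if strategy == "random":
        return frozenset(f for f in range(n_features) if rng.random() < 0.5)
    raise ValueError(f"unknown strategy: {strategy!r}")


def neighbors(subset: Subset, n_features: int) -> Iterator[Subset]:
    """Yield every subset that differs from `subset` in exactly one feature."""
    for f in range(n_features):
        yield subset ^ frozenset({f})  # add f if absent, remove it if present


def hill_climb(
    score: Callable[[Subset], float],
    n_features: int,
    strategy: str = "empty",
    seed: int = 0,
) -> Tuple[Subset, float]:
    """Move to the best-scoring neighbor until no neighbor improves the score."""
    rng = random.Random(seed)
    current = initial_subset(strategy, n_features, rng)
    current_score = score(current)
    while True:
        best = max(neighbors(current, n_features), key=score, default=current)
        best_score = score(best)
        if best_score <= current_score:
            return current, current_score  # local optimum reached
        current, current_score = best, best_score


# Hypothetical toy score: features 0 and 2 are relevant, the rest are noise.
RELEVANT = frozenset({0, 2})


def toy_score(subset: Subset) -> float:
    """Reward relevant features and lightly penalize irrelevant ones."""
    return len(subset & RELEVANT) - 0.1 * len(subset - RELEVANT)


if __name__ == "__main__":
    for strategy in ("empty", "full", "random"):
        subset, value = hill_climb(toy_score, n_features=5, strategy=strategy)
        print(strategy, sorted(subset), value)
```

On this toy score every single-feature flip that fixes a mistake strictly improves the value, so all three starting strategies converge to the same subset. With a realistic evaluation function the search can stall in different local optima depending on where it starts, which is the kind of search-method and starting-strategy interaction the paper sets out to measure.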