Feature Subset Selection Using a Genetic Algorithm

  • Authors:
  • Jihoon Yang; Vasant G. Honavar

  • Venue:
  • IEEE Intelligent Systems
  • Year:
  • 1998

Abstract

In practical pattern-classification tasks such as medical diagnosis, a classification function learned through an inductive learning algorithm assigns a given input pattern to one of a finite set of classes. Typically, the representation of each input pattern consists of a vector of attribute, feature, or measurement values. The choice of features to represent the patterns affects several aspects of pattern classification, including accuracy, required learning time, necessary number of examples, and cost.

In the automated design of pattern classifiers, these variables present us with the feature subset selection problem. This is the task of identifying and selecting a useful subset of pattern-representing features from a larger set of features. The features in the larger set have different associated measurement costs and risks, and some may be irrelevant or mutually redundant.

A significant, practical example of such a scenario is the task of selecting a subset of clinical tests (each with a different financial cost, diagnostic value, and associated risk) to be performed for medical diagnosis. Other instances of the feature subset selection problem arise in, for example, large-scale data-mining applications and power system control.

Several approaches to feature subset selection exist; ours employs a genetic algorithm. The experiments we describe in this article demonstrate the effectiveness of our approach in the automated design of neural networks for pattern classification and knowledge discovery.
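
As a rough illustration of the idea named in the abstract, the Python sketch below encodes each candidate feature subset as a bit string and evolves a population of such strings with a simple genetic algorithm. It is not the authors' implementation. Everything in it is an assumption made for illustration: the toy data, the nearest-centroid classifier (swapped in for the neural networks the article uses), the hypothetical per-feature costs, the fitness form (accuracy minus a weighted cost), and all function and parameter names.

```python
import random


def make_data(rng, n=120, n_features=8, informative=(0, 1)):
    """Toy two-class data: only the `informative` columns depend on the label."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 3.0 * label  # shift informative features for class 1
        X.append(row)
        y.append(label)
    return X, y


def centroid_accuracy(X_train, y_train, X_test, y_test, mask):
    """Accuracy of a nearest-centroid classifier using only features where mask is 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    centroids = {}
    for label in set(y_train):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]

    def sq_dist(x, label):
        return sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[label]))

    correct = sum(
        min(centroids, key=lambda c: sq_dist(x, c)) == t
        for x, t in zip(X_test, y_test)
    )
    return correct / len(y_test)


def fitness(mask, data, costs, cost_weight=0.1):
    """Assumed fitness: held-out accuracy minus a penalty for the cost of selected features."""
    accuracy = centroid_accuracy(*data, mask)
    relative_cost = sum(c for c, bit in zip(costs, mask) if bit) / sum(costs)
    return accuracy - cost_weight * relative_cost


def select_features(data, costs, rng, pop_size=20, generations=20, mutation_rate=0.05):
    """Evolve bit masks with elitism, tournament selection, one-point crossover, bit-flip mutation."""
    n = len(costs)
    cache = {}

    def score(mask):
        key = tuple(mask)
        if key not in cache:
            cache[key] = fitness(mask, data, costs)
        return cache[key]

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        next_gen = [population[0][:], population[1][:]]  # keep the two best unchanged
        while len(next_gen) < pop_size:
            a = max(rng.sample(population, 3), key=score)  # tournament of size 3
            b = max(rng.sample(population, 3), key=score)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=score)


rng = random.Random(0)
X_train, y_train = make_data(rng)
X_test, y_test = make_data(rng)
data = (X_train, y_train, X_test, y_test)
costs = [1.0, 2.0, 1.0, 3.0, 1.0, 2.0, 1.0, 3.0]  # hypothetical measurement costs
best = select_features(data, costs, rng)
print("selected mask:", best)
print("fitness:", round(fitness(best, data, costs), 3))
```

The cost penalty is what separates this from selecting on accuracy alone: two subsets with equal accuracy are ranked by how cheap their features are to measure, which mirrors the clinical-test example in the abstract. The weight of 0.1 is arbitrary and would need tuning for any real task.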