On the Scalability of Genetic Algorithms to Very Large-Scale Feature Selection

  • Authors:
  • Andreas Moser; M. Narasimha Murthy

  • Venue:
  • Real-World Applications of Evolutionary Computing, EvoWorkshops 2000: EvoIASP, EvoSCONDI, EvoTel, EvoSTIM, EvoROB, and EvoFlight
  • Year:
  • 2000

Abstract

Feature Selection is a very promising optimisation strategy for Pattern Recognition systems. However, as an NP-complete task, it is extremely difficult to carry out. Past studies were therefore rather limited in either the cardinality of the feature space or the number of patterns utilised to assess feature subset performance. This study examines the scalability of Distributed Genetic Algorithms to very large-scale Feature Selection. As the domain of application, a classification system for Optical Characters is chosen. The system is tailored to classify hand-written digits, involving 768 binary features. Due to the vastness of the investigated problem, this study forms a step into new realms in Feature Selection for classification. We present a set of customisations of GAs that allow known concepts to be applied to Feature Selection problems of practical interest. Some limitations of GAs in the domain of Feature Selection are revealed and improvements are suggested. A widely used strategy to accelerate the optimisation process, Training Set Sampling, was observed to fail in this domain of application. Experiments on unseen validation data suggest that Distributed GAs are capable of reducing the problem complexity significantly. The results show that the classification accuracy can be maintained while reducing the feature space cardinality by about 50%. Genetic Algorithms are demonstrated to scale well to very large-scale problems in Feature Selection.
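
To illustrate the general idea behind GA-based Feature Selection, the following is a minimal sketch only, not the authors' method. It uses a single-population GA rather than the Distributed GA studied in the paper, and every name in it (`evolve_mask`, `toy_fitness`, the parameter values, the choice of tournament selection and uniform crossover) is an assumption made for illustration. Each chromosome is a binary mask with one bit per feature; in a real system the fitness would be the accuracy of a classifier trained on the selected features, which is replaced here by a toy scoring function.

```python
import random


def evolve_mask(fitness, n_features, pop_size=20, generations=30,
                mutation_rate=0.01, seed=0):
    """Evolve a binary feature mask that maximises `fitness` (hypothetical helper)."""
    rng = random.Random(seed)
    # Each chromosome is a list of 0/1 flags, one per feature.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        next_population = [best[:]]  # elitism: carry the best mask over unchanged
        while len(next_population) < pop_size:
            # Binary tournament selection for each parent.
            p1 = max(rng.sample(population, 2), key=fitness)
            p2 = max(rng.sample(population, 2), key=fitness)
            # Uniform crossover followed by bit-flip mutation.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_population.append(child)
        population = next_population
        best = max(population, key=fitness)
    return best


def toy_fitness(mask, informative=frozenset(range(8)), penalty=0.1):
    """Toy stand-in for classifier accuracy (assumption, not from the paper):
    reward selected informative features and penalise the subset size."""
    hits = sum(1 for i, g in enumerate(mask) if g and i in informative)
    return hits - penalty * sum(mask)


if __name__ == "__main__":
    mask = evolve_mask(toy_fitness, n_features=32)
    print("selected features:", [i for i, g in enumerate(mask) if g])
```

Because the best mask is copied unchanged into each new generation, the fitness of the returned mask never falls below that of the best initial individual. The penalty term in `toy_fitness` stands in for the pressure towards smaller feature subsets that the abstract describes.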