Evolutionary generation of neural network classifiers: An empirical comparison

  • Authors: M. Castellani
  • Affiliations: Theoretical Ecology Group, Department of Biology, University of Bergen, Postboks 7803, 5020 Bergen, Norway
  • Venue: Neurocomputing
  • Year: 2013

Abstract

Most methods for the evolutionary generation of multi-layer perceptron classifiers use a divide-and-conquer strategy, where the tasks of feature selection, structure design, and weight training are performed separately. The concurrent evolution of the whole classifier has seldom been attempted, and its effectiveness has never been exhaustively benchmarked. This paper presents an experimental study on the merits of this latter approach. Two schemes were investigated. The first method simultaneously evolves the neural network structure and the input feature vector, and trains the candidate solutions via a standard learning procedure (wrapper approach). The second method simultaneously evolves the whole classifier (embedded approach). The performance of these two algorithms was compared to that of two manual and two automatic neural network optimisation techniques on thirteen well-known pattern recognition benchmarks. The experimental results revealed the specific strengths and weaknesses of the six algorithms. Overall, the evolutionary embedded method obtained good results in terms of classification accuracy and compactness of the solutions. The tests indicated that the outcome of the feature selection task has a major impact on the accuracy and compactness of the solutions. Evolutionary algorithms performed best on feature spaces of small and medium size, and were the most effective at rejecting redundant features. Classical filter-based algorithms that rely on feature correlation are preferable on undersampled data sets. Correlation- and saliency-based selection was the most effective method in the presence of a large number of irrelevant features. The applicability and performance of the wrapper algorithm were severely limited by the computational costs of the approach.