Conquering the Needle-in-a-Haystack: How Correlated Input Variables Beneficially Alter the Fitness Landscape for Neural Networks

  • Authors:
  • Stephen D. Turner; Marylyn D. Ritchie; William S. Bush

  • Affiliations:
  • Center for Human Genetics Research, Department of Molecular Physiology & Biophysics, Vanderbilt University, Nashville, USA (all authors)

  • Venue:
  • EvoBIO '09 Proceedings of the 7th European Conference on Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics
  • Year:
  • 2009

Abstract

Evolutionary algorithms such as genetic programming and grammatical evolution have been used for simultaneously optimizing network architecture, variable selection, and weights for artificial neural networks. Using an evolutionary algorithm to perform variable selection while searching for non-linear interactions is akin to searching for a needle in a haystack. There is, however, a considerable amount of correlation among variables in biological datasets, such as in microarray or genetic studies. Using the XOR problem, we show that correlation between non-functional and functional variables alters the variable selection fitness landscape by broadening the fitness peak over a wider range of potential input variables. Furthermore, when sub-optimal weights are used, local optima in the variable selection fitness landscape appear centered on each of the two functional variables. These attributes of the fitness landscape may supply building blocks for evolutionary search procedures, and may provide a rationale for conducting a local search for variable selection.
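The broadening of the fitness peak described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' method: it assumes binary inputs, a hypothetical neighbor-copying scheme controlled by `rho` and `window` to create correlation, and a fixed XOR function standing in for a neural network with ideal weights. The names `make_data`, `fitness`, and `landscape` are invented for this illustration.

```python
import random


def make_data(n_samples, n_vars, functional, rho, window, rng):
    """Simulate binary data whose outcome is the XOR of two functional variables.

    Assumption (not from the paper): correlation is induced by letting each
    variable within `window` positions of a functional variable copy its
    inward neighbor with probability `rho`; otherwise it stays independent.
    """
    data = []
    for _ in range(n_samples):
        x = [rng.randint(0, 1) for _ in range(n_vars)]
        for f in functional:
            for step in (-1, 1):  # walk left, then right, from the functional variable
                k = f + step
                while 0 <= k < n_vars and abs(k - f) <= window and k not in functional:
                    if rng.random() < rho:
                        x[k] = x[k - step]
                    k += step
        y = x[functional[0]] ^ x[functional[1]]
        data.append((x, y))
    return data


def fitness(data, i, j):
    """Accuracy of predicting the outcome as XOR of variables i and j.

    A fixed XOR stands in for a network with ideal weights (a simplification).
    """
    hits = sum((x[i] ^ x[j]) == y for x, y in data)
    return hits / len(data)


def landscape(data, n_vars, fixed):
    """Fitness of every candidate variable when paired with one fixed variable."""
    return [fitness(data, i, fixed) for i in range(n_vars)]


if __name__ == "__main__":
    rng = random.Random(0)
    for rho in (0.0, 0.8):
        data = make_data(2000, 20, (5, 14), rho, 3, rng)
        scores = landscape(data, 20, fixed=14)
        print(f"rho={rho}:", " ".join(f"{s:.2f}" for s in scores))
```

Under these assumptions, with `rho=0.0` only the pairing with variable 5 scores well and every other candidate sits near chance, which is the needle-in-a-haystack case. With `rho=0.8`, candidates adjacent to variable 5 score well above chance and the fitness decays with distance, so the peak is broader. The sketch does not model the sub-optimal-weight case mentioned in the abstract.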