Defining locality as a problem difficulty measure in genetic programming

  • Authors:
  • Edgar Galván-López;James McDermott;Michael O'Neill;Anthony Brabazon

  • Affiliations:
  • Natural Computing Research and Applications Group, University College Dublin, Dublin, Ireland;Natural Computing Research and Applications Group, University College Dublin, Dublin, Ireland;Natural Computing Research and Applications Group, University College Dublin, Dublin, Ireland;Natural Computing Research and Applications Group, University College Dublin, Dublin, Ireland

  • Venue:
  • Genetic Programming and Evolvable Machines
  • Year:
  • 2011

Abstract

A mapping is local if it preserves neighbourhood. In Evolutionary Computation, locality is generally described as the property that neighbouring genotypes correspond to neighbouring phenotypes. A representation has high locality if most genotypic neighbours are mapped to phenotypic neighbours. Locality is seen as a key element in performing effective evolutionary search. It is believed that a representation that has high locality will perform better in evolutionary search and the contrary is true for a representation that has low locality. When locality was introduced, it was the genotype-phenotype mapping in bitstring-based Genetic Algorithms which was of interest; more recently, it has also been used to study the same mapping in Grammatical Evolution. To our knowledge, there are few explicit studies of locality in Genetic Programming (GP). The goal of this paper is to shed some light on locality in GP and use it as an indicator of problem difficulty. Strictly speaking, in GP the genotype and the phenotype are not distinct. We attempt to extend the standard quantitative definition of genotype-phenotype locality to the genotype-fitness mapping by considering three possible definitions. We consider the effects of these definitions in both continuous- and discrete-valued fitness functions. We compare three different GP representations (two of them induced by using different function sets and the other using a slightly different GP encoding) and six different mutation operators. Results indicate that one definition of locality is better in predicting performance.
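
The notion of high locality described in the abstract can be illustrated with a minimal sketch for the bitstring genotype-phenotype setting it mentions. This is not the paper's method and does not reproduce any of its three genotype-fitness definitions. Everything here is an assumption made for illustration: genotypic neighbours are taken to be bitstrings one bit flip apart, phenotypic distance is the absolute difference of the decoded integers, the minimum phenotypic distance `d_min` is 1, and the names `neighbours`, `decode_binary`, `count_ones`, and `locality_score` are hypothetical. The score is the mean deviation of phenotypic distance from `d_min` over all neighbouring genotype pairs, so a lower score means higher locality.

```python
from itertools import product


def neighbours(genotype):
    """Yield every genotype that differs from `genotype` in exactly one bit."""
    for i in range(len(genotype)):
        flipped = list(genotype)
        flipped[i] = 1 - flipped[i]
        yield tuple(flipped)


def decode_binary(genotype):
    """Standard binary decoding, most significant bit first (illustrative mapping)."""
    value = 0
    for bit in genotype:
        value = value * 2 + bit
    return value


def count_ones(genotype):
    """Number of 1 bits (illustrative, highly redundant mapping)."""
    return sum(genotype)


def locality_score(decode, n_bits, d_min=1):
    """Mean of |d_p(g, h) - d_min| over all one-bit-flip neighbour pairs (g, h).

    d_p is assumed to be the absolute difference of decoded phenotypes.
    A score of 0 means every genotypic neighbour is also a phenotypic neighbour.
    """
    total = 0
    pairs = 0
    for g in product((0, 1), repeat=n_bits):
        for h in neighbours(g):
            total += abs(abs(decode(g) - decode(h)) - d_min)
            pairs += 1
    return total / pairs


if __name__ == "__main__":
    print("binary decoding:", locality_score(decode_binary, 3))
    print("count of ones:  ", locality_score(count_ones, 3))
```

Under these assumptions, counting ones scores 0 because a single bit flip always changes the phenotype by exactly 1, whereas standard binary decoding scores higher because flipping a high-order bit moves the phenotype far away. Extending such a measure from phenotypes to fitness values is the question the paper addresses.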