The impact of population size on code growth in GP: analysis and empirical validation

Authors:
Riccardo Poli;Nicholas Freitag McPhee;Leonardo Vanneschi
Affiliations:
University of Essex, Colchester, United Kingdom;University of Minnesota, Morris, Morris, MN, USA;University of Milano-Bicocca, Milan, Italy
Venue:
Proceedings of the 10th annual conference on Genetic and evolutionary computation
Year:
2008

Citing 16
Cited 13

Genetic programming: on the programming of computers by means of natural selection

Genetic programming: on the programming of computers by means of natural selection
Foundations of genetic programming

Foundations of genetic programming
Complexity Compression and Evolution

Proceedings of the 6th International Conference on Genetic Algorithms
Convergence Rates For The Distribution Of Program Outputs

GECCO '02 Proceedings of the Genetic and Evolutionary Computation Conference
A Schema Theory Analysis of the Evolution of Size in Genetic Programming with Linear Representations

EuroGP '01 Proceedings of the 4th European Conference on Genetic Programming
General Schema Theory for Genetic Programming with Subtree-Swapping Crossover

EuroGP '01 Proceedings of the 4th European Conference on Genetic Programming
General schema theory for genetic programming with subtree-swapping crossover: Part II

Evolutionary Computation
Dynamics of evolutionary robustness

Proceedings of the 8th annual conference on Genetic and evolutionary computation
A quantitative study of neutrality in GP boolean landscapes

Proceedings of the 8th annual conference on Genetic and evolutionary computation
Generalisation of the limiting distribution of program sizes in tree-based genetic programming and analysis of its effects on bloat

Proceedings of the 9th annual conference on Genetic and evolutionary computation
Effects of code growth and parsimony pressure on populations in genetic programming

Evolutionary Computation
Convergence of program fitness landscapes

GECCO'03 Proceedings of the 2003 international conference on Genetic and evolutionary computation: PartII
Dynamic maximum tree depth: a simple technique for avoiding bloat in tree-based GP

GECCO'03 Proceedings of the 2003 international conference on Genetic and evolutionary computation: PartII
On the limiting distribution of program sizes in tree-based genetic programming

EuroGP'07 Proceedings of the 10th European conference on Genetic programming
A Field Guide to Genetic Programming

A Field Guide to Genetic Programming
The halting probability in von neumann architectures

EuroGP'06 Proceedings of the 9th European conference on Genetic Programming

Dynamic limits for bloat control in genetic programming and a review of past and current bloat theories

Genetic Programming and Evolvable Machines
The Role of Population Size in Rate of Evolution in Genetic Programming

EuroGP '09 Proceedings of the 12th European Conference on Genetic Programming
Extending Operator Equalisation: Fitness Based Self Adaptive Length Distribution for Bloat Free GP

EuroGP '09 Proceedings of the 12th European Conference on Genetic Programming
Operator equalisation, bloat and overfitting: a study on human oral bioavailability prediction

Proceedings of the 11th Annual conference on Genetic and evolutionary computation
Program optimization by random tree sampling

Proceedings of the 11th Annual conference on Genetic and evolutionary computation
Using Operator Equalisation for Prediction of Drug Toxicity with Genetic Programming

EPIA '09 Proceedings of the 14th Portuguese Conference on Artificial Intelligence: Progress in Artificial Intelligence
Bloat control operators and diversity in genetic programming: A comparative study

Evolutionary Computation
Measuring bloat, overfitting and functional complexity in genetic programming

Proceedings of the 12th annual conference on Genetic and evolutionary computation
Reassembling operator equalisation: a secret revealed

Proceedings of the 13th annual conference on Genetic and evolutionary computation
Reassembling operator equalisation: a secret revealed

ACM SIGEVOlution
Bloat free genetic programming versus classification trees for identification of burned areas in satellite imagery

EvoApplications'10 Proceedings of the 2010 international conference on Applications of Evolutionary Computation - Volume Part I
Operator equalisation for bloat free genetic programming and a survey of bloat control methods

Genetic Programming and Evolvable Machines
Bloat free genetic programming: application to human oral bioavailability prediction

International Journal of Data Mining and Bioinformatics

Abstract

The crossover bias theory for bloat [18] is a recent result which predicts that bloat is caused by the sampling of short, unfit programs. This theory is clear and simple, but it has some weaknesses: (1) it implicitly assumes that the population is large enough to allow sampling of all relevant program sizes, and it does not explain what to expect in the many practical cases where this is not true (e.g., because the population is small); (2) it does not explain what is meant by its assumption that short programs are unfit. In this paper we discuss these weaknesses and propose a refined version of the crossover bias theory that clarifies the relationship between bloat and finite populations, and explains what features of the fitness landscape cause bloat to occur. In particular, the theory predicts that smaller populations will bloat more slowly than larger ones. Additionally, the theory predicts that bloat will only be observed in problems where short programs are less fit than longer ones when looking at samples created by fitness-based importance sampling, i.e., samples of the search space in which fitter programs have a higher probability of being drawn (e.g., using the Metropolis-Hastings method). Experiments with two classical GP benchmarks fully corroborate the theory.
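
To make the idea of fitness-based importance sampling more concrete, the following is a minimal sketch of a Metropolis-Hastings sampler, not the procedure used in the paper. Everything in it is an assumption made for illustration: "programs" are bit strings of length 1 to `MAX_LEN`, `toy_fitness` is a hypothetical fitness function, and the grow/shrink/flip proposal is chosen only because it is symmetric. The sampler visits programs with probability proportional to their fitness, after which the mean fitness of the sampled programs can be compared across sizes.

```python
import random
from collections import defaultdict

MAX_LEN = 8  # hypothetical cap on program length for this toy example


def toy_fitness(program):
    """Hypothetical fitness: 1 + number of ones (always positive)."""
    return 1 + sum(program)


def propose(program, rng):
    """Symmetric proposal over bit strings of length 1..MAX_LEN.

    Returns a neighbouring program, or None when the move is impossible;
    the caller then stays in place, which keeps the proposal symmetric.
    """
    r = rng.random()
    if r < 0.5:
        # Grow: each specific child is proposed with probability 1/2 * 1/2 = 1/4.
        if len(program) == MAX_LEN:
            return None
        return program + (rng.randint(0, 1),)
    if r < 0.75:
        # Shrink: probability 1/4, matching the reverse grow move.
        if len(program) == 1:
            return None
        return program[:-1]
    # Flip one uniformly chosen bit; the length is unchanged, so this is symmetric.
    i = rng.randrange(len(program))
    return program[:i] + (1 - program[i],) + program[i + 1:]


def metropolis_hastings(n_steps, seed=0):
    """Sample programs with probability proportional to toy_fitness."""
    rng = random.Random(seed)
    current = (0,)
    samples = []
    for _ in range(n_steps):
        candidate = propose(current, rng)
        if candidate is not None:
            # Symmetric proposal, so the acceptance probability is
            # min(1, fitness(candidate) / fitness(current)).
            ratio = toy_fitness(candidate) / toy_fitness(current)
            if rng.random() < ratio:
                current = candidate
        samples.append(current)
    return samples


def mean_fitness_by_size(samples):
    """Average fitness of the sampled programs, grouped by program length."""
    totals = defaultdict(lambda: [0.0, 0])
    for program in samples:
        totals[len(program)][0] += toy_fitness(program)
        totals[len(program)][1] += 1
    return {size: total / count for size, (total, count) in sorted(totals.items())}


samples = metropolis_hastings(20000)
for size, mean_fit in mean_fitness_by_size(samples).items():
    print(size, round(mean_fit, 2))
```

In this toy landscape longer programs can hold more ones, so the short programs in the sample are on average less fit than the long ones. According to the abstract, this is the kind of fitness-versus-size relationship under which the refined theory predicts bloat.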