Genetic Programming for Classification: An Analysis of Convergence Behaviour

Authors:
Thomas Loveard;Victor Ciesielski
Affiliations:
-;-
Venue:
AI '02 Proceedings of the 15th Australian Joint Conference on Artificial Intelligence: Advances in Artificial Intelligence
Year:
2002

Abstract

This paper investigates the unexpected convergence behaviour of genetic programming (GP) for classification problems. Firstly, the paper investigates the relationship between computational effort and attainable classification accuracy. Secondly, we attempt to understand why GP classifiers sometimes fail to reach satisfactory levels of accuracy for certain problems regardless of computational effort. To explore these areas, the investigation uses an artificially generated dataset for which certain properties are known in advance.

Results from this artificial problem show that increasing computational effort, in the form of larger population sizes and more generations, does improve the probability of success for a run, but that the added computational cost far outweighs the gain in success rate. Also, some runs, even with very large populations running for many generations, became stagnant and were unable to find an acceptable solution. These results are also reflected in real-world classification problems.

From analysis of the sub-tree components making up successful and unsuccessful programs, it was noted that a small number of particular components were almost always present in successful programs, and that these components were often absent from unsuccessful programs. Also, a variety of components appeared in unsuccessful programs that were never present in successful ones. Evidence from runs suggests that these components represent paths leading to optimal and sub-optimal branches in the evolutionary search space. Additionally, results suggest that if sub-optimal components (which mirror the concept of deception in genetic algorithms) outnumber the optimal components for the problem, then the chances of GP finding a successful solution are reduced.
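
The sub-tree component analysis described in the abstract could be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the nested-tuple program representation, the function names (`subtrees`, `component_presence`, `compare_components`), and the toy programs are all assumptions made for the example.

```python
from collections import Counter


def subtrees(tree):
    """Yield every sub-tree of a program stored as nested tuples.

    Assumed representation: (function_name, child_1, child_2, ...),
    with terminals stored as plain strings or numbers.
    """
    yield tree
    if isinstance(tree, tuple):
        for child in tree[1:]:
            yield from subtrees(child)


def component_presence(programs):
    """For each distinct sub-tree, count how many programs contain it at least once."""
    counts = Counter()
    for program in programs:
        counts.update(set(subtrees(program)))
    return counts


def compare_components(successful, unsuccessful):
    """Return (fraction of successful programs containing each component,
    components that occur only in unsuccessful programs)."""
    good = component_presence(successful)
    bad = component_presence(unsuccessful)
    presence_in_successful = {c: n / len(successful) for c, n in good.items()}
    only_in_unsuccessful = sorted((c for c in bad if c not in good), key=repr)
    return presence_in_successful, only_in_unsuccessful


# Hypothetical toy programs; real evolved classifiers would be far larger.
successful = [
    ("<", ("-", "f1", "f2"), 0.5),
    ("and", ("<", ("-", "f1", "f2"), 0.5), (">", "f3", 0.1)),
]
unsuccessful = [
    (">", ("+", "f1", "f3"), 0.2),
]

presence, only_bad = compare_components(successful, unsuccessful)
```

Under these assumptions, a component with a presence fraction near 1.0 in `presence` would be a candidate for the "almost always present in successful programs" category, while entries in `only_bad` would be candidates for the components never seen in successful programs.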