Genetic programming, validation sets, and parsimony pressure

  • Authors:
  • Christian Gagné; Marc Schoenauer; Marc Parizeau; Marco Tomassini

  • Affiliations:
  • Équipe TAO – INRIA Futurs, LRI Bat. 490, Université Paris Sud, Orsay, France; Équipe TAO – INRIA Futurs, LRI Bat. 490, Université Paris Sud, Orsay, France; Laboratoire de Vision et Systèmes Numériques (LVSN), Département de Génie Électrique et de Génie Informatique, Université Laval, Québec (QC), Canada; Information Systems Institute, Université de Lausanne, Dorigny, Switzerland

  • Venue:
  • EuroGP'06 Proceedings of the 9th European conference on Genetic Programming
  • Year:
  • 2006

Abstract

Fitness functions based on test cases are very common in Genetic Programming (GP). This process can be seen as a learning task, where models are inferred from a limited number of samples. This paper investigates two methods to improve generalization in GP-based learning: 1) selecting the best-of-run individual using a three-data-set methodology, and 2) applying parsimony pressure to reduce the complexity of the solutions. Results using GP in a binary classification setup show that, while accuracy on the test sets is preserved with less variance than the baseline results, the mean tree size obtained with the tested methods is significantly reduced.
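To make the three-data-set methodology concrete, the sketch below (not the authors' implementation) splits the labelled samples into training, validation, and test sets, and picks the best-of-run individual by validation accuracy, breaking ties toward smaller trees as a simple stand-in for parsimony pressure. The names `accuracy`, `ind.size`, and the split fractions are illustrative assumptions, not taken from the paper.

```python
import random

def split_three_ways(samples, train_frac=0.5, valid_frac=0.25, seed=0):
    """Shuffle labelled samples and split them into train/validation/test sets.
    The fractions here are arbitrary placeholders, not the paper's setup."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_valid = int(len(shuffled) * valid_frac)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test

def select_best_of_run(candidates, valid_set, accuracy):
    """Choose the best-of-run individual by accuracy on the validation set.
    Ties are broken in favour of the smaller tree (a crude parsimony bias);
    `accuracy(ind, data)` and `ind.size` are hypothetical interfaces to a GP engine."""
    return max(candidates, key=lambda ind: (accuracy(ind, valid_set), -ind.size))
```

The key point the sketch illustrates is that the validation set is used only to choose among evolved candidates, while the test set is reserved for reporting generalization accuracy.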