A quantitative study of learning and generalization in genetic programming

  • Authors:
  • Mauro Castelli; Luca Manzoni; Sara Silva; Leonardo Vanneschi

  • Affiliations:
  • Dipartimento di Informatica, Sistemistica e Comunicazione, University of Milano-Bicocca, Milan, Italy; Dipartimento di Informatica, Sistemistica e Comunicazione, University of Milano-Bicocca, Milan, Italy; INESC-ID Lisboa, KDBIO group, Lisbon, Portugal; Dipartimento di Informatica, Sistemistica e Comunicazione, University of Milano-Bicocca, Milan, Italy and INESC-ID Lisboa, KDBIO group, Lisbon, Portugal

  • Venue:
  • EuroGP'11: Proceedings of the 14th European Conference on Genetic Programming
  • Year:
  • 2011

Abstract

The relationship between generalization and the functional complexity of solutions in genetic programming (GP) has recently been investigated. This paper makes three main contributions. (1) A new measure of functional complexity for GP solutions, called Graph Based Complexity (GBC), is defined; we show that it has a higher correlation with GP performance on out-of-sample data than another complexity measure introduced in a recent publication. (2) A new measure called Graph Based Learning Ability (GBLA) is presented. It is inspired by GBC, and its goal is to quantify the ability of GP to learn "difficult" training points; we show that GBLA is negatively correlated with the performance of GP on out-of-sample data. (3) Finally, we use the ideas that inspired the definitions of GBC and GBLA to define a new fitness function, whose suitability is demonstrated empirically. The experimental results reported in this paper were obtained on three real-life multidimensional regression problems.