An empirical study of functional complexity as an indicator of overfitting in genetic programming

  • Authors:
  • Leonardo Trujillo; Sara Silva; Pierrick Legrand; Leonardo Vanneschi

  • Affiliations:
  • Instituto Tecnológico de Tijuana, Tijuana, BC, México; INESC-ID Lisboa, KDBIO group, Lisbon, Portugal and CISUC, ECOS group, University of Coimbra, Portugal; IMB, Institut de Mathématiques de Bordeaux, UMR, CNRS, France and INRIA Bordeaux Sud-Ouest, France; INESC-ID Lisboa, KDBIO group, Lisbon, Portugal and Department of Informatics, Systems and Communication, University of Milano-Bicocca, Milan, Italy

  • Venue:
  • EuroGP'11: Proceedings of the 14th European Conference on Genetic Programming
  • Year:
  • 2011

Abstract

Recently, it has been stated that the complexity of a solution is a good indicator of the amount of overfitting it incurs. However, measuring the complexity of a program in Genetic Programming is not a trivial task. In this paper, we study functional complexity and how it relates to overfitting on symbolic regression problems. We consider two measures of complexity: Slope-based Functional Complexity, inspired by the concept of curvature, and Regularity-based Functional Complexity, based on the concept of Hölderian regularity. In general, both complexity measures appear to be poor indicators of program overfitting. However, the results suggest that Regularity-based Functional Complexity could provide a good indication of overfitting in extreme cases.