Overfitting detection and adaptive covariant parsimony pressure for symbolic regression

Authors:
Gabriel Kronberger; Michael Kommenda; Michael Affenzeller
Affiliations:
Upper Austria University of Applied Sciences, Hagenberg, Austria; Upper Austria University of Applied Sciences, Hagenberg, Austria; Upper Austria University of Applied Sciences, Hagenberg, Austria
Venue:
Proceedings of the 13th annual conference companion on Genetic and evolutionary computation
Year:
2011

Abstract

Covariant parsimony pressure is a theoretically motivated method aimed primarily at controlling bloat. In this contribution we describe an adaptive method for controlling covariant parsimony pressure that aims to reduce overfitting in symbolic regression. The method is based on the assumption that overfitting can be reduced by controlling the evolution of program length. Additionally, we propose an overfitting detection criterion based on the correlation between the training-set and validation-set fitness values of all models in the population. The proposed method uses covariant parsimony pressure to decrease the average program length when overfitting occurs and allows the average program length to increase in the absence of overfitting. The approach is applied to two real-world datasets. The experimental results show that the correlation of training and validation fitness can be used as an indicator of overfitting and that the proposed adaptation of covariant parsimony pressure alleviates overfitting in symbolic regression experiments on the two datasets.
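
The two ideas in the abstract, a correlation-based overfitting indicator and a parsimony coefficient that steers average program length, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state which correlation measure, threshold, or adaptation rule is used, so the Pearson correlation, the `threshold` value, the `target_cov` parameter, and all function names below are assumptions made for illustration. The sketch also assumes that fitness is maximized and that the adjusted fitness has the linear form f - c * length, with c chosen from the covariance of length and fitness divided by the variance of length.

```python
from statistics import fmean


def covariance(xs, ys):
    """Population covariance of two equally long sequences."""
    mx, my = fmean(xs), fmean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either sequence is constant."""
    sx = covariance(xs, xs) ** 0.5
    sy = covariance(ys, ys) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0
    return covariance(xs, ys) / (sx * sy)


def is_overfitting(train_fitness, validation_fitness, threshold=0.5):
    """Hypothetical criterion: flag overfitting when training and
    validation fitness of the population are only weakly correlated.
    The threshold value is an assumption, not taken from the paper."""
    return pearson(train_fitness, validation_fitness) < threshold


def parsimony_coefficient(lengths, train_fitness, target_cov=0.0):
    """Coefficient c such that Cov(length, f - c * length) == target_cov.

    target_cov = 0 corresponds to holding the average length steady,
    a negative value pushes it down, a positive value lets it grow
    (assuming fitness-proportional selection on the adjusted fitness)."""
    var_len = covariance(lengths, lengths)
    if var_len == 0:
        return 0.0
    return (covariance(lengths, train_fitness) - target_cov) / var_len


def adjusted_fitness(lengths, train_fitness, validation_fitness,
                     shrink_cov=-0.1, grow_cov=0.1):
    """Pick the pressure direction from the overfitting indicator and
    return the length-penalized training fitness of every model."""
    overfit = is_overfitting(train_fitness, validation_fitness)
    c = parsimony_coefficient(
        lengths, train_fitness, shrink_cov if overfit else grow_cov)
    return [f - c * n for f, n in zip(train_fitness, lengths)]
```

Parameterizing the coefficient by a target covariance, rather than scaling it by a constant factor, keeps the direction of the pressure well defined even when longer programs happen to be less fit than shorter ones.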