Evolving model trees for mining data sets with continuous-valued classes

  • Authors:
  • Gavin Potgieter; Andries P. Engelbrecht

  • Affiliations:
  • Department of Computer Science, University of Pretoria, South Africa; Department of Computer Science, University of Pretoria, South Africa

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2008

Abstract

This paper presents a genetic programming (GP) approach, called GPMCC, for extracting symbolic rules from data sets with continuous-valued classes. The GPMCC uses a genetic algorithm (GA) to evolve multi-variate non-linear models [Potgieter, G., & Engelbrecht, A. (2007). Genetic algorithms for the structural optimisation of learned polynomial expressions. Applied Mathematics and Computation] at the terminal nodes of the GP. Several mechanisms have been developed to optimise the GP, including a fragment pool of candidate non-linear models, k-means clustering of the training data to facilitate the use of stratified sampling methods, and specialized mutation and crossover operators to evolve structurally optimal and accurate models. It is shown that the GPMCC is insensitive to control parameter values. Experimental results show that the accuracy of the GPMCC is comparable to that of NeuroLinear and Cubist, while the GPMCC produces significantly fewer rules with less complex antecedents.
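The abstract mentions k-means clustering of the training data as a way to support stratified sampling, but gives no details. The following is a minimal sketch of that general idea only, not the GPMCC implementation: the function names `kmeans` and `stratified_sample`, the `fraction` parameter, and the choice of plain Lloyd's iteration are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means on equal-length tuples; returns one label per point.

    Hypothetical helper for illustration, not taken from the paper.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise centroids from the data
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def stratified_sample(points, labels, fraction, seed=0):
    """Draw about `fraction` of each cluster, with at least one point per cluster."""
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for p, lab in zip(points, labels):
        by_cluster[lab].append(p)
    sample = []
    for lab in sorted(by_cluster):
        members = by_cluster[lab]
        n = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, n))  # sample without replacement
    return sample
```

Sampling per cluster rather than uniformly keeps small regions of the input space represented in each training subset, which is the usual motivation for stratifying on cluster membership. How the GPMCC actually consumes such samples is not described in the abstract.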