Symbolic regression based on Pareto Front GP is the key approach for generating high-performance, parsimonious empirical models acceptable for industrial applications. The paper addresses the issue of finding the optimal parameter settings of Pareto Front GP, which direct the simulated evolution toward simple models with acceptable prediction error. A generic methodology based on statistical design of experiments is proposed. It includes statistical determination of the number of replicates from the half-width of confidence intervals, determination of the significant inputs by a fractional factorial design of experiments, approaching the optimum by steepest ascent/descent, and local exploration around the optimum by a Box-Behnken or central composite design of experiments. The results from applying the proposed methodology to a small-sized industrial data set show that the statistically significant factors for symbolic regression based on Pareto Front GP are the number of cascades, the number of generations, and the population size. A second-order regression model with a high R² of 0.97 includes these three parameters, and their optimal values have been determined. The optimal parameter settings were validated with a separate small-sized industrial data set. The optimal settings are recommended for symbolic regression applications using data sets with up to 5 inputs and up to 50 data points.
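Three of the design-of-experiments steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: all function names and parameters are hypothetical, the replicate count uses a normal approximation rather than a t-based interval, and the Box-Behnken generator is restricted to three factors, matching the three significant factors reported above (number of cascades, number of generations, population size) in coded units.

```python
import math
from itertools import combinations, product
from statistics import NormalDist


def replicates_for_half_width(pilot_std, half_width, confidence=0.95):
    """Replicates needed so the confidence-interval half-width is at most
    `half_width`, given a pilot standard deviation (assumption: normal
    approximation, n = ceil((z * s / h)^2))."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil((z * pilot_std / half_width) ** 2)


def fractional_factorial(n_base, generators):
    """Two-level fractional factorial design in coded units (-1, +1).

    `n_base` factors form a full factorial; each entry of `generators` is a
    tuple of base-column indices whose product defines one extra factor.
    """
    runs = []
    for base in product((-1, 1), repeat=n_base):
        extra = tuple(math.prod(base[i] for i in gen) for gen in generators)
        runs.append(base + extra)
    return runs


def box_behnken_3(n_center=3):
    """Box-Behnken design for exactly three factors in coded units.

    Each pair of factors is run at all (+/-1, +/-1) combinations while the
    third factor is held at its center level 0; center runs are appended.
    """
    runs = []
    for i, j in combinations(range(3), 2):
        for a, b in product((-1, 1), repeat=2):
            row = [0, 0, 0]
            row[i], row[j] = a, b
            runs.append(tuple(row))
    runs.extend([(0, 0, 0)] * n_center)
    return runs


if __name__ == "__main__":
    # Hypothetical pilot values, chosen only to exercise the functions.
    print(replicates_for_half_width(pilot_std=2.0, half_width=0.5))
    # 2^(4-1) screening design: fourth factor aliased with the ABC interaction.
    for run in fractional_factorial(3, [(0, 1, 2)]):
        print(run)
    # Local exploration around the optimum for three significant factors.
    print(len(box_behnken_3()), "Box-Behnken runs")
```

In practice, each coded run would be mapped back to actual parameter values before running the GP, and a second-order regression model would then be fitted to the responses from the Box-Behnken runs.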