Knowledge mining with genetic programming methods for variable selection in flavor design

Authors:
Katya Vladislavleva;Kalyan Veeramachaneni;Matt Burland;Jason Parcon;Una-May O'Reilly
Affiliations:
Antwerp University, Antwerp, Belgium;Massachusetts Institute of Technology, Cambridge, MA, USA;Givaudan Flavors Corporation, Cincinnati, OH, USA;Givaudan Flavors Corporation, Cincinnati, OH, USA;Massachusetts Institute of Technology, Cambridge, MA, USA
Venue:
Proceedings of the 12th annual conference on Genetic and evolutionary computation
Year:
2010

Citing 6

Genetic programming: on the programming of computers by means of natural selection
Genetic Programming for Feature Detection and Image Segmentation. Selected Papers from AISB Workshop on Evolutionary Computing
Scaled Symbolic Regression. Genetic Programming and Evolvable Machines
Feature construction and dimension reduction using genetic programming. AI'07 Proceedings of the 20th Australian joint conference on Advances in artificial intelligence
Evolutionary optimization of flavors. Proceedings of the 12th annual conference on Genetic and evolutionary computation
Coevolution of Fitness Predictors. IEEE Transactions on Evolutionary Computation

Cited 4

Separating the wheat from the chaff: on feature selection and feature importance in regression random forests and symbolic regression. Proceedings of the 13th annual conference companion on Genetic and evolutionary computation
Data mining using unguided symbolic regression on a blast furnace dataset. EvoApplications'11 Proceedings of the 2011 international conference on Applications of evolutionary computation - Volume Part I
Macro-economic time series modeling and interaction networks. EvoApplications'11 Proceedings of the 2011 international conference on Applications of evolutionary computation - Volume Part II
Knowledge mining sensory evaluation data: genetic programming, statistical techniques, and swarm optimization. Genetic Programming and Evolvable Machines

Abstract

This paper presents a novel approach for knowledge mining from a sparse, repeated-measures dataset. Genetic programming based symbolic regression is used to generate multiple models that provide alternative explanations of the data. This set of models, called an ensemble, is generated separately for each of the repeated measures. The resulting ensembles are then used to (a) identify which variables are important in each ensemble, (b) cluster the ensembles into groups whose response variable is driven by similar variables, and (c) measure the sensitivity of the response with respect to the important variables. We apply our methodology to a sensory science dataset. The data contain hedonic evaluations (liking scores), assigned by a diverse set of human testers, for a small set of flavors composed from seven ingredients. Our approach (1) identifies the important ingredients that drive the liking score of a panelist, (2) segments the panelists into groups that are driven by the same ingredient, and (3) enables flavor scientists to perform sensitivity analysis of liking scores relative to changes in the levels of important ingredients.
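
The three analysis steps named in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how importance, grouping, or sensitivity are computed, so everything here is an assumption made for illustration: importance is taken as the fraction of models in an ensemble that use a variable, panelists are grouped by their single most-used variable, and sensitivity is an ensemble-averaged central finite difference. The ingredient names `x1` to `x7`, the dictionary layout of a model, and the toy models themselves are hypothetical stand-ins for evolved symbolic-regression models.

```python
from collections import Counter, defaultdict

# Hypothetical ingredient names; the abstract says only that there are seven.
INGREDIENTS = [f"x{i}" for i in range(1, 8)]


def variable_importance(ensemble):
    """Assumed measure: fraction of models in the ensemble using each variable."""
    counts = Counter(v for model in ensemble for v in model["variables"])
    return {v: counts[v] / len(ensemble) for v in INGREDIENTS}


def segment_panelists(ensembles):
    """Group panelists by the variable used most often in their ensemble."""
    groups = defaultdict(list)
    for panelist, ensemble in ensembles.items():
        importance = variable_importance(ensemble)
        # Ties are broken by ingredient order in INGREDIENTS.
        top = max(INGREDIENTS, key=lambda v: importance[v])
        groups[top].append(panelist)
    return dict(groups)


def sensitivity(ensemble, point, variable, step=0.01):
    """Ensemble-averaged central finite difference of the predicted response."""
    up = dict(point, **{variable: point[variable] + step})
    down = dict(point, **{variable: point[variable] - step})
    diffs = [
        (m["predict"](up) - m["predict"](down)) / (2 * step) for m in ensemble
    ]
    return sum(diffs) / len(diffs)


# Toy ensembles (illustrative only, not data or models from the paper).
ensembles = {
    "panelist_a": [
        {"variables": {"x1", "x3"}, "predict": lambda p: 2.0 * p["x1"] + p["x3"]},
        {"variables": {"x1"}, "predict": lambda p: 2.0 * p["x1"] + 0.5},
    ],
    "panelist_b": [
        {"variables": {"x5"}, "predict": lambda p: 3.0 - p["x5"]},
        {"variables": {"x5", "x2"}, "predict": lambda p: 3.0 - p["x5"] * p["x2"]},
    ],
}

groups = segment_panelists(ensembles)
point = {v: 1.0 for v in INGREDIENTS}
slope_a = sensitivity(ensembles["panelist_a"], point, "x1")
```

In this toy setup, `groups` places each panelist under the ingredient that appears in most of that panelist's models, and `slope_a` estimates how the predicted liking score of `panelist_a` changes per unit change in `x1`. A presence-based count is only one possible importance measure; a fitness-weighted count would be an equally plausible choice.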