Complex function sets improve symbolic discriminant analysis of microarray data

Authors:
David M. Reif;Bill C. White;Nancy Olsen;Thomas Aune;Jason H. Moore
Affiliations:
Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN;Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN;Department of Medicine, Vanderbilt University, Nashville, TN;Department of Medicine, Vanderbilt University, Nashville, TN;Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN
Venue:
GECCO'03: Proceedings of the 2003 international conference on Genetic and evolutionary computation: Part II
Year:
2003

Abstract

Our ability to simultaneously measure the expression levels of thousands of genes in biological samples is providing important new opportunities for improving the diagnosis, prevention, and treatment of common diseases. However, new technologies such as DNA microarrays also pose new challenges for variable selection and statistical modeling. In response to these challenges, a genetic programming-based strategy called symbolic discriminant analysis (SDA) has been developed for the automatic selection of gene expression variables and mathematical functions for statistical modeling of clinical endpoints. The initial development and evaluation of SDA have focused on a function set consisting of only the four basic arithmetic operators. The goal of the present study is to evaluate whether adding more complex operators, such as square root, to the function set improves SDA modeling of microarray data. The results presented in this paper demonstrate that adding complex functions to the function set significantly improves SDA modeling by reducing model size and, in some cases, reducing classification error and runtime. We anticipate that SDA will be an important new evolutionary computation tool in the repertoire of methods for the analysis of microarray data.
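
To make the idea of a function set concrete, the following Python sketch shows how a symbolic discriminant function built from gene expression variables might be evaluated under a basic arithmetic function set and under one extended with square root. It is an illustration under stated assumptions, not the authors' implementation: the protected division and protected square root are common genetic programming conventions that the abstract does not specify, and the gene names, the tuple-based tree encoding, and the threshold of 0.0 are all hypothetical.

```python
import math
import operator


def protected_div(a, b):
    # Assumed GP convention: return 1.0 instead of dividing by zero.
    return a / b if b != 0 else 1.0


def protected_sqrt(a):
    # Assumed GP convention: take the root of the absolute value.
    return math.sqrt(abs(a))


# Function name -> (callable, arity). The basic set holds the four
# arithmetic operators; the extended set adds square root.
BASIC_FUNCTIONS = {
    "+": (operator.add, 2),
    "-": (operator.sub, 2),
    "*": (operator.mul, 2),
    "/": (protected_div, 2),
}
EXTENDED_FUNCTIONS = {**BASIC_FUNCTIONS, "sqrt": (protected_sqrt, 1)}


def evaluate(tree, sample, functions):
    """Evaluate an expression tree on one sample.

    A tree is a gene name (str), a numeric constant, or a tuple
    (function_name, child, ...). `sample` maps gene names to values.
    """
    if isinstance(tree, str):
        return sample[tree]
    if isinstance(tree, (int, float)):
        return float(tree)
    name, *children = tree
    func, arity = functions[name]
    if len(children) != arity:
        raise ValueError(f"{name} expects {arity} argument(s)")
    return func(*(evaluate(child, sample, functions) for child in children))


def tree_size(tree):
    """Count nodes (functions plus terminals) as a simple size measure."""
    if not isinstance(tree, tuple):
        return 1
    return 1 + sum(tree_size(child) for child in tree[1:])


def classification_error(tree, samples, labels, functions, threshold=0.0):
    """Fraction of samples whose thresholded score disagrees with its label."""
    wrong = 0
    for sample, label in zip(samples, labels):
        predicted = 1 if evaluate(tree, sample, functions) > threshold else 0
        wrong += predicted != label
    return wrong / len(samples)


# Hypothetical model: sqrt(gene_a) - gene_b
model = ("-", ("sqrt", "gene_a"), "gene_b")
samples = [{"gene_a": 9.0, "gene_b": 2.0}, {"gene_a": 4.0, "gene_b": 5.0}]
labels = [1, 0]
```

With these definitions, `classification_error(model, samples, labels, EXTENDED_FUNCTIONS)` scores the hypothetical model on the toy data, while evaluating the same tree with `BASIC_FUNCTIONS` fails because `sqrt` is not in that set. Keeping the function set as a plain dictionary makes the comparison the abstract describes a matter of swapping which dictionary the search is allowed to draw from.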