Symbolic Discriminant Analysis for Mining Gene Expression Patterns

  • Authors:
  • Jason H. Moore;Joel S. Parker;Lance W. Hahn

  • Venue:
  • ECML '01 Proceedings of the 12th European Conference on Machine Learning
  • Year:
  • 2001

Abstract

New laboratory technologies have made it possible to measure the expression levels of thousands of genes simultaneously in a particular cell or tissue. The challenge for computational biologists will be to develop methods that can identify subsets of gene expression variables that classify cells and tissues into meaningful clinical groups. Linear discriminant analysis is a popular multivariate statistical approach for classifying observations into groups because its theory is well described and the method is easy to implement and interpret. However, an important limitation is that linear discriminant functions need to be pre-specified. To address this limitation and the limitation of linearity, we developed symbolic discriminant analysis (SDA) for the automatic selection of gene expression variables and discriminant functions that can take any form. We have implemented the genetic programming machine learning methodology for optimizing SDA in parallel on a Beowulf-style computer cluster.
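
The idea of a discriminant function that "can take any form" can be illustrated with a small sketch. This is not the authors' implementation: the function set (`+`, `-`, `*`), the midpoint threshold rule, the toy data, and every name below are assumptions made for illustration, and a plain random search stands in for genetic programming to keep the example short. Each candidate function is an expression tree whose leaves are gene expression variables, so choosing a tree selects both the variables and the functional form.

```python
import operator
import random

# Assumed function set; the abstract does not specify one.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def random_tree(n_genes, depth, rng):
    """Build a random expression tree. Leaves are gene indices (ints);
    internal nodes are (operator, left, right) tuples."""
    if depth == 0 or rng.random() < 0.3:
        return rng.randrange(n_genes)
    op = rng.choice(sorted(OPS))
    return (op,
            random_tree(n_genes, depth - 1, rng),
            random_tree(n_genes, depth - 1, rng))


def evaluate(tree, sample):
    """Compute the symbolic discriminant score of one sample."""
    if isinstance(tree, int):
        return sample[tree]
    op, left, right = tree
    return OPS[op](evaluate(left, sample), evaluate(right, sample))


def misclassified(tree, samples, labels):
    """Count classification errors for a two-group problem (labels 0/1).
    Assumed rule: threshold at the midpoint of the two group mean scores."""
    scores = [evaluate(tree, s) for s in samples]
    g0 = [sc for sc, y in zip(scores, labels) if y == 0]
    g1 = [sc for sc, y in zip(scores, labels) if y == 1]
    m0, m1 = sum(g0) / len(g0), sum(g1) / len(g1)
    threshold = (m0 + m1) / 2
    errors = 0
    for sc, y in zip(scores, labels):
        # Predict group 1 on the side of the threshold where its mean lies.
        pred = int((sc > threshold) == (m1 > m0))
        errors += pred != y
    return errors


def random_search(samples, labels, n_trees=200, depth=3, seed=0):
    """Stand-in for genetic programming: keep the best of n_trees random trees."""
    rng = random.Random(seed)
    n_genes = len(samples[0])
    candidates = [random_tree(n_genes, depth, rng) for _ in range(n_trees)]
    best = min(candidates, key=lambda t: misclassified(t, samples, labels))
    return best, misclassified(best, samples, labels)


# Toy data (hypothetical): group 1 has gene0 * gene1 > 0, a nonlinear pattern.
samples = [[1.0, 1.0, 0.2], [2.0, 2.0, 0.9], [1.0, -1.0, 0.5], [-2.0, 2.0, 0.1]]
labels = [1, 1, 0, 0]

best_tree, best_errors = random_search(samples, labels)
print(best_tree, best_errors)
```

On the toy data, the hand-written tree `("*", 0, 1)` separates the two groups while the single variable `0` does not, which is the kind of nonlinear pattern a fixed linear form would miss. A genetic programming version would replace `random_search` with selection, crossover, and mutation over a population of such trees; since each tree's fitness is evaluated independently, that step is the natural one to distribute across cluster nodes.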