Gene selection criterion for discriminant microarray data analysis based on extreme value distributions

  • Authors:
  • Wentian Li; Ivo Grosse

  • Affiliations:
  • North Shore LIJ Research Institute, Manhasset, NY; Cold Spring Harbor Laboratory, Cold Spring Harbor, NY

  • Venue:
  • RECOMB '03 Proceedings of the seventh annual international conference on Research in computational molecular biology
  • Year:
  • 2003

Abstract

An important issue commonly encountered in the analysis of microarray data is deciding which genes, and how many, should be selected for further study. For discriminant microarray data analyses based on statistical models, such as the logistic regression model, this gene selection can be accomplished by comparing the maximum likelihood of the model given the real data, L(D|M), with the expected maximum likelihood of the model given an ensemble of surrogate data, L(D0|M). Typically, the computational burden of obtaining L(D0|M) is immense, often exceeding the limits of available resources by orders of magnitude. Here, we propose an approach that circumvents such heavy computations by mapping the simulation problem to an extreme value problem, which can be easily solved by numerical simulation. We choose three classification problems from two publicly available microarray datasets to illustrate this approach.
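To make the extreme value idea concrete, here is a minimal Python sketch. It is not taken from the paper, and the authors' actual procedure may differ. It assumes single-gene models, independent genes, and that under the null hypothesis the likelihood-ratio statistic 2(log L1 - log L0) of each gene approximately follows a chi-square distribution with one degree of freedom. Under those assumptions, the expected best log-likelihood gain among N genes on surrogate data is half the expected maximum of N chi-square(1) draws, which is cheap to simulate without refitting any model. The function names and parameters are hypothetical.

```python
import random
import statistics


def simulate_max_chi2(n_genes, n_rep=200, seed=0):
    """Return n_rep draws of the maximum of n_genes independent chi-square(1) variates.

    Assumption (not from the source): each gene's null likelihood-ratio
    statistic is chi-square(1) and genes are independent.
    """
    rng = random.Random(seed)
    maxima = []
    for _ in range(n_rep):
        # A chi-square(1) variate is the square of a standard normal variate.
        maxima.append(max(rng.gauss(0.0, 1.0) ** 2 for _ in range(n_genes)))
    return maxima


def expected_max_loglik_gain(n_genes, n_rep=200, seed=0):
    """Estimate the expected top-gene log-likelihood gain under the null.

    The statistic is 2 * (log L1 - log L0), so the gain in log-likelihood
    is half of the simulated maximum, averaged over replicates.
    """
    return 0.5 * statistics.mean(simulate_max_chi2(n_genes, n_rep, seed))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, round(expected_max_loglik_gain(n), 2))
```

In this sketch, a gene would be considered for selection only if its observed log-likelihood gain on the real data clearly exceeds the simulated null value for the same number of genes. The threshold grows with the number of genes screened, which is why the comparison matters more as the gene count increases.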