Gene selection criterion for discriminant microarray data analysis based on extreme value distributions

Authors:
Wentian Li;Ivo Grosse
Affiliations:
North Shore LIJ Research Institute, Manhasset, NY;Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
Venue:
RECOMB '03 Proceedings of the seventh annual international conference on Research in computational molecular biology
Year:
2003

Abstract

A common problem in the analysis of microarray data is deciding which genes, and how many, should be selected for further study. For discriminant microarray data analyses based on statistical models, such as the logistic regression model, gene selection can be accomplished by comparing the maximum likelihood of the model given the real data, L(D|M), with the expected maximum likelihood of the model given an ensemble of surrogate data, L(D0|M). Typically, the computational burden of obtaining L(D0|M) is immense, often exceeding the limits of available resources by orders of magnitude. Here, we propose an approach that circumvents these heavy computations by mapping the simulation problem to an extreme value problem, which can be easily solved by numerical simulation. We choose three classification problems from two publicly available microarray datasets to illustrate this approach.
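
The extreme value idea in the abstract can be illustrated with a small sketch. It is not the authors' procedure, and it rests on assumptions not stated in the abstract: each gene is scored by a likelihood-ratio statistic, that statistic follows a chi-square distribution with one degree of freedom for surrogate (null) data, and genes are independent. Under those assumptions, the best score expected from surrogate data is the expected maximum of n_genes chi-square draws, which can be simulated directly without refitting any model. The function names below are hypothetical.

```python
import numpy as np

def simulate_max_statistic(n_genes, n_trials=1000, df=1, seed=0):
    """Sample the largest of n_genes null statistics, n_trials times.

    Assumption: each null statistic is an independent chi-square(df) draw.
    """
    rng = np.random.default_rng(seed)
    maxima = np.empty(n_trials)
    for t in range(n_trials):
        # One surrogate "experiment": n_genes null scores, keep the best one.
        maxima[t] = rng.chisquare(df, size=n_genes).max()
    return maxima

def expected_max_threshold(n_genes, **kwargs):
    """Average of the simulated maxima, used here as a selection threshold."""
    return simulate_max_statistic(n_genes, **kwargs).mean()

def select_genes(statistics, threshold):
    """Indices of genes whose observed statistic exceeds the threshold."""
    return np.flatnonzero(np.asarray(statistics) > threshold)

if __name__ == "__main__":
    threshold = expected_max_threshold(n_genes=5000)
    print("threshold for 5000 genes:", threshold)
```

The threshold grows with the number of genes screened, because the maximum of more null draws is larger. A gene would be retained in this sketch only if its observed statistic exceeds what the best of equally many null genes is expected to reach.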