A Theoretical Analysis of Gene Selection

  • Authors:
  • Sach Mukherjee;Stephen J. Roberts

  • Affiliations:
  • University of Oxford;University of Oxford

  • Venue:
  • CSB '04 Proceedings of the 2004 IEEE Computational Systems Bioinformatics Conference
  • Year:
  • 2004

Abstract

A great deal of recent research has focused on the challenging task of selecting differentially expressed genes from microarray data ("gene selection"). Numerous gene selection algorithms have been proposed in the literature, but it is often unclear exactly how these algorithms respond to conditions such as small sample sizes or differing variances. Choosing an appropriate algorithm can therefore be difficult in many cases. In this paper we propose a theoretical analysis of gene selection, in which the probability of successfully selecting relevant genes, using a given gene ranking function, is explicitly calculated in terms of population parameters. The theory developed is applicable to any ranking function that has a known sampling distribution, or one that can be approximated analytically. In contrast to empirical methods, the analysis can easily be used to examine the behaviour of gene selection algorithms under a wide variety of conditions, even when the number of genes involved runs into the tens of thousands. The utility of our approach is illustrated by comparing three well-known gene ranking functions.
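
To make the idea of computing a selection probability from population parameters concrete, the following is a minimal sketch, not the authors' derivation. It assumes a deliberately simple ranking function (the difference of class sample means), Gaussian expression values, independent genes, equal class sizes, and a one-sided ranking. All function and parameter names (`prob_outranks_one`, `prob_outranks_all`, `delta`, `sigma_rel`, `sigma_irr`) are hypothetical and chosen for illustration only.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_outranks_one(delta, sigma_rel, sigma_irr, n):
    """P(a relevant gene's mean difference exceeds that of one irrelevant gene).

    Assumption: with n samples per class, each mean difference is Gaussian
    with variance 2 * sigma**2 / n; the relevant gene has true difference
    delta and the irrelevant gene has true difference 0.
    """
    var_rel = 2.0 * sigma_rel ** 2 / n
    var_irr = 2.0 * sigma_irr ** 2 / n
    return normal_cdf(delta / math.sqrt(var_rel + var_irr))


def prob_outranks_all(delta, sigma_rel, sigma_irr, n, m, steps=4000):
    """P(a relevant gene outranks all m independent irrelevant genes).

    Integrates the relevant gene's sampling density against the probability
    that m irrelevant statistics all fall below it (trapezoidal rule).
    """
    s_rel = math.sqrt(2.0 / n) * sigma_rel
    s_irr = math.sqrt(2.0 / n) * sigma_irr
    lo, hi = delta - 8.0 * s_rel, delta + 8.0 * s_rel
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        pdf = math.exp(-0.5 * ((x - delta) / s_rel) ** 2) / (s_rel * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * pdf * normal_cdf(x / s_irr) ** m
    return total * h


if __name__ == "__main__":
    # Small samples and unequal variances, one relevant gene vs. 10,000 irrelevant ones.
    for n in (5, 10, 20):
        p = prob_outranks_all(delta=2.0, sigma_rel=1.0, sigma_irr=0.5, n=n, m=10_000)
        print(f"n={n:2d}  P(relevant gene ranked first) = {p:.4f}")
```

Because the probability is evaluated from the sampling distribution rather than by repeated simulation, the cost of the calculation in this sketch does not grow with the number of irrelevant genes `m`, which is the practical advantage over empirical comparison that the abstract points to.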