Microarray Classification from Several Two-Gene Expression Comparisons

  • Authors:
  • Donald Geman; Bahman Afsari; Aik Choon Tan; Daniel Q. Naiman

  • Venue:
  • ICMLA '08 Proceedings of the 2008 Seventh International Conference on Machine Learning and Applications
  • Year:
  • 2008

Abstract

We describe our contribution to the ICMLA 2008 “Automated Micro-Array Classification Challenge”. The design of our classifier is motivated by the special scenario encountered in molecular cancer classification based on the mRNA concentrations provided by gene microarray data. Our classifier is rank-based: it depends only on expression comparisons among selected pairs of genes. Such comparisons are invariant to most of the transformations involved in preprocessing and normalization. Every pair of genes determines a binary classifier: choose the class for which the observed ordering is most likely. Pairs are scored by maximizing accuracy. In our k-TSP (k-disjoint Top Scoring Pairs) classifier, k disjoint pairs of genes are learned from training data; the discriminant function is simply the difference in the number of votes for the two classes. This rule involves exactly 2k genes, is readily interpretable, and provides some state-of-the-art results in cancer diagnosis and prognosis for small values of k, even k=1.
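
The voting rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`pair_score`, `fit_ktsp`, `predict_ktsp`), the 0/1 class labels, the use of training accuracy as the pair score, the greedy selection of disjoint pairs, the tie-breaking toward class 0, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations


def pair_score(X, y, i, j):
    """Score the rule that compares genes i and j (hypothetical helper).

    Returns (training_accuracy, class_if_less), where class_if_less is the
    class predicted when x[i] < x[j]. Assumes labels are 0/1 and that both
    classes occur in y.
    """
    # counts[c][0]: class-c samples with x[i] < x[j]; counts[c][1]: the rest
    counts = {0: [0, 0], 1: [0, 0]}
    for row, label in zip(X, y):
        counts[label][0 if row[i] < row[j] else 1] += 1
    # Estimated probability of the ordering x[i] < x[j] within each class
    p0 = counts[0][0] / sum(counts[0])
    p1 = counts[1][0] / sum(counts[1])
    # Assign each ordering to the class in which it is more likely
    class_if_less = 0 if p0 >= p1 else 1
    correct = counts[class_if_less][0] + counts[1 - class_if_less][1]
    return correct / len(y), class_if_less


def fit_ktsp(X, y, k):
    """Greedily pick up to k disjoint gene pairs in order of decreasing score."""
    scored = []
    for i, j in combinations(range(len(X[0])), 2):
        acc, class_if_less = pair_score(X, y, i, j)
        scored.append((acc, i, j, class_if_less))
    scored.sort(key=lambda t: -t[0])  # stable sort: ties keep index order
    used, pairs = set(), []
    for acc, i, j, class_if_less in scored:
        if i in used or j in used:
            continue  # keep the selected pairs disjoint
        pairs.append((i, j, class_if_less))
        used.update((i, j))
        if len(pairs) == k:
            break
    return pairs


def predict_ktsp(pairs, x):
    """Classify x by the difference in votes between the two classes."""
    diff = 0  # votes for class 1 minus votes for class 0
    for i, j, class_if_less in pairs:
        vote = class_if_less if x[i] < x[j] else 1 - class_if_less
        diff += 1 if vote == 1 else -1
    return 1 if diff > 0 else 0  # assumption: ties go to class 0


# Toy data (made up): rows are samples, columns are four genes
X = [
    [1, 5, 3, 2],
    [2, 6, 1, 4],
    [0, 4, 2, 2],
    [7, 3, 2, 5],
    [8, 1, 4, 3],
    [9, 2, 1, 6],
]
y = [0, 0, 0, 1, 1, 1]

pairs = fit_ktsp(X, y, k=1)
print(pairs, predict_ktsp(pairs, [9, 1, 0, 0]))
```

Because each vote depends only on which of two values is larger, any monotone increasing transformation applied to all values within a sample leaves the prediction unchanged. Choosing an odd k avoids ties in the vote difference.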