Integrating Multiple-Platform Expression Data through Gene Set Features

  • Authors:
  • Matěj Holec; Filip Železný; Jiří Kléma; Jakub Tolar

  • Affiliations:
  • Czech Technical University, Prague; Czech Technical University, Prague; Czech Technical University, Prague; University of Minnesota, Minneapolis

  • Venue:
  • ISBRA '09 Proceedings of the 5th International Symposium on Bioinformatics Research and Applications
  • Year:
  • 2009

Abstract

We demonstrate a set-level approach to integrating multiple-platform gene expression data for predictive classification and show its utility for boosting classification performance when single-platform samples are rare. We explore three ways of defining gene sets, including a novel one based on the notion of a fully coupled flux in metabolic pathways. In two tissue classification tasks, we show empirically that the gene-set-based approach is useful for combining heterogeneous expression data. Surprisingly, in experiments constrained to a single platform, biologically meaningful gene sets acting as sample features are often outperformed by random gene sets with no biological relevance.
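
To make the set-level idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a gene set feature is the mean expression of the set's member genes measured on a given platform; the abstract does not state the aggregation function, so the mean is an illustrative choice. The names `gene_set_features` and `random_gene_sets`, the gene identifiers, and the expression values are all hypothetical. The size-matched random sets are likewise one plausible way to build a baseline with no biological relevance.

```python
import random
from statistics import mean


def gene_set_features(sample, gene_sets):
    """Map one sample's gene-level expression to set-level features.

    sample:    dict mapping gene id -> expression value (genes on this platform)
    gene_sets: dict mapping set name -> list of gene ids
    Returns a dict mapping set name -> mean expression over the member genes
    that were measured, or None if no member gene was measured.
    ASSUMPTION: the mean is used as the aggregate; the source does not say.
    """
    features = {}
    for name, genes in gene_sets.items():
        values = [sample[g] for g in genes if g in sample]
        features[name] = mean(values) if values else None
    return features


def random_gene_sets(gene_sets, all_genes, seed=0):
    """Build random gene sets with the same sizes as the given ones.

    ASSUMPTION: size matching is an illustrative baseline design.
    """
    rng = random.Random(seed)
    return {name: rng.sample(all_genes, len(genes))
            for name, genes in gene_sets.items()}


# Hypothetical gene sets and two samples measured on different platforms.
gene_sets = {"set_A": ["g1", "g2", "g3"], "set_B": ["g3", "g4"]}
sample_platform_1 = {"g1": 1.0, "g2": 3.0, "g4": 4.0}  # g3 not measured
sample_platform_2 = {"g2": 2.0, "g3": 6.0}             # g1, g4 not measured

f1 = gene_set_features(sample_platform_1, gene_sets)
f2 = gene_set_features(sample_platform_2, gene_sets)
baseline_sets = random_gene_sets(gene_sets, ["g1", "g2", "g3", "g4"])
```

Although the two samples share only one measured gene, both end up described by the same feature names (`set_A`, `set_B`), which is what would allow samples from different platforms to be pooled into one training set for a classifier.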