Biomarker discovery in microarray gene expression data with Gaussian processes

  • Authors:
  • Wei Chu; Zoubin Ghahramani; Francesco Falciani; David L. Wild

  • Affiliations:
  • Gatsby Computational Neuroscience Unit, University College London, UK; Gatsby Computational Neuroscience Unit, University College London, UK; School of Biosciences, University of Birmingham, UK; Keck Graduate Institute of Applied Life Sciences, Claremont, CA 91711, USA

  • Venue:
  • Bioinformatics
  • Year:
  • 2005

Abstract

Motivation: In clinical practice, pathological phenotypes are often labelled with ordinal scales rather than binary, e.g. the Gleason grading system for tumour cell differentiation. However, in the literature of microarray analysis, these ordinal labels have been rarely treated in a principled way. This paper describes a gene selection algorithm based on Gaussian processes to discover consistent gene expression patterns associated with ordinal clinical phenotypes. The technique of automatic relevance determination is applied to represent the significance level of the genes in a Bayesian inference framework.

Results: The usefulness of the proposed algorithm for ordinal labels is demonstrated by the gene expression signature associated with the Gleason score for prostate cancer data. Our results demonstrate how multi-gene markers that may be initially developed with a diagnostic or prognostic application in mind are also useful as an investigative tool to reveal associations between specific molecular and cellular events and features of tumour physiology. Our algorithm can also be applied to microarray data with binary labels with results comparable to other methods in the literature.

Availability: The source code was written in ANSI C, which is accessible at www.gatsby.ucl.ac.uk/~chuwei/code/gpgenes.tar

Contact: wild@kgi.edu
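
The automatic relevance determination (ARD) idea mentioned in the abstract can be illustrated with a small sketch. It assumes a Gaussian covariance function with one non-negative relevance weight per gene, so that a gene whose weight is close to zero has almost no influence on the covariance between samples. The names `ard_kernel`, `rank_genes`, `kappa` and `kappa0`, and the toy numbers, are hypothetical and chosen only for illustration. This is not the authors' ANSI C implementation, and it leaves out the ordinal likelihood and the Bayesian inference of the weights.

```python
import math

def ard_kernel(x, y, kappa, kappa0=1.0):
    """ARD Gaussian covariance between two expression profiles.

    kappa[d] >= 0 is the (assumed) relevance weight of gene d;
    kappa0 is an overall signal-variance scale.
    """
    weighted_sq_dist = sum(k * (a - b) ** 2 for k, a, b in zip(kappa, x, y))
    return kappa0 * math.exp(-0.5 * weighted_sq_dist)

def rank_genes(kappa):
    """Return gene indices sorted from most to least relevant."""
    return sorted(range(len(kappa)), key=lambda d: kappa[d], reverse=True)

# Toy data: two samples, three genes (values made up for illustration).
sample_a = [0.2, 1.5, -0.7]
sample_b = [0.9, 1.5, 2.0]
kappa = [2.0, 0.5, 0.0]  # gene 2 has zero relevance, so it is ignored

k_ab = ard_kernel(sample_a, sample_b, kappa)
ranking = rank_genes(kappa)

# Changing the expression of the zero-weight gene leaves the covariance unchanged.
sample_b_perturbed = [0.9, 1.5, -5.0]
k_ab_perturbed = ard_kernel(sample_a, sample_b_perturbed, kappa)
```

In a full treatment the weights would be inferred from the labelled data rather than set by hand, and genes would then be ranked by their inferred weights as in `rank_genes`.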