An improved statistic for detecting over-represented gene ontology annotations in gene sets

  • Authors:
  • Steffen Grossmann;Sebastian Bauer;Peter N. Robinson;Martin Vingron

  • Affiliations:
  • Max Planck Institute for Molecular Genetics, Berlin, Germany;Max Planck Institute for Molecular Genetics, Berlin, Germany;Institute for Medical Genetics, Charité University Hospital, Humboldt University, Berlin, Germany;Max Planck Institute for Molecular Genetics, Berlin, Germany

  • Venue:
  • RECOMB'06 Proceedings of the 10th annual international conference on Research in Computational Molecular Biology
  • Year:
  • 2006

Abstract

We propose an improved statistic for detecting over-represented Gene Ontology (GO) annotations in gene sets. While current methods treat each term independently and hence ignore the structure of the GO hierarchy, our approach takes parent-child relationships into account. Over-representation of a term is measured with respect to the presence of its parental terms in the set. This resolves the problem that the standard approach tends to falsely detect over-representation of more specific terms below terms that are known to be over-represented. To show this, we generated gene sets in which single terms are artificially over-represented and compared the receiver operating characteristics of the two approaches on these sets. A comparison on a biological dataset further supports our method. Our approach adds no computational complexity compared to the standard approach. An implementation is available within the framework of the freely available Ontologizer application.
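To illustrate the idea, the following is a minimal sketch and not the Ontologizer implementation. It assumes that the standard approach is a one-sided hypergeometric test of each term against the whole population, and that the parent-child variant applies the same test after restricting both the population and the study set to genes annotated to the term's parents. The function names, the toy gene sets, and the choice to take the union of the parents' genes when a term has several parents are all assumptions made for this example.

```python
from math import comb


def hypergeom_tail(k, M, n, N):
    """P(X >= k) when N items are drawn without replacement from M items,
    n of which are marked (one-sided hypergeometric test)."""
    upper = min(n, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / comb(M, N)


def term_for_term_p(term, study, population, annotations):
    """Assumed standard approach: test the term against the whole population."""
    term_genes = annotations[term] & population
    k = len(term_genes & study)
    return hypergeom_tail(k, len(population), len(term_genes), len(study))


def parent_child_p(term, parents, study, population, annotations):
    """Sketch of a parent-child test: only genes annotated to the parents count.
    With several parents, this sketch uses the union of their genes (assumption)."""
    parent_genes = set().union(*(annotations[p] for p in parents)) & population
    term_genes = annotations[term] & parent_genes
    study_in_parents = study & parent_genes
    k = len(term_genes & study_in_parents)
    return hypergeom_tail(k, len(parent_genes), len(term_genes), len(study_in_parents))


# Hypothetical toy data: 100 genes, a parent term with 20 genes,
# and a child term covering half of the parent's genes.
population = set(range(100))
annotations = {
    "parent": set(range(20)),
    "child": set(range(10)),
}
# The study set is drawn entirely from the parent term, so the parent is
# strongly over-represented; half of the study genes also carry the child term.
study = {0, 1, 2, 3, 4, 10, 11, 12, 13, 14}

p_standard = term_for_term_p("child", study, population, annotations)
p_parent_child = parent_child_p("child", ["parent"], study, population, annotations)

print(f"term-for-term p-value for child: {p_standard:.2e}")
print(f"parent-child p-value for child:  {p_parent_child:.2f}")
```

In this toy case the child term looks strongly over-represented when tested against the whole population, but not once the test is conditioned on the genes of its parent, which is the behaviour described in the abstract. Both tests evaluate the same hypergeometric tail and differ only in the counts passed to it.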