Expert knowledge without the expert: integrated analysis of gene expression and literature to derive active functional contexts

  • Authors:
  • Robert Küffner; Katrin Fundel; Ralf Zimmer

  • Affiliations:
  • Department of Informatics, Ludwig-Maximilians-Universität München, Amalienstrasse 17, 80333 München, Germany (all authors)

  • Venue:
  • Bioinformatics
  • Year:
  • 2005

Abstract

Motivation: The interpretation of expression data without appropriate expert knowledge is difficult and usually limited to exploratory data analysis, such as clustering and detecting differentially regulated genes. However, comparing experimental results against manually compiled knowledge resources might limit or bias the perspective on the data. Thus, manual analysis by experts is required to obtain confident predictions about involved processes.

Results: We present an algorithm to simultaneously derive interpretations of expression measurements together with biological hypotheses from biomedical publications. It identifies active functional contexts ('concepts'), i.e. gene clusters that exhibit both significant gene expression and a coherent literature profile. Manual intervention by an expert in specifying prior knowledge is not required. The approach scales to realistic applications and does not rely on controlled vocabularies or pathway resources. We validated our algorithm by analyzing a current juvenile arthritis dataset. As an interpretation of the data, the algorithm identifies a number of gene clusters and accompanying literature topics that coincide well with the phenotype and the biological processes known to be involved in the disease. We demonstrate that the generated clusters are both more sensitive and more specific than Gene Ontology categories detected on the same data. The method allows for in-depth investigation of subsets of genes, the associated literature topics and publications.

Availability: Supplementary data on clusters are available upon request.

Contact: Robert.Kueffner@bio.ifi.lmu.de
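
Illustration: The notion of a 'concept' in the abstract (a gene cluster with both significant expression and a coherent literature profile) can be made concrete with a toy sketch. This is not the authors' algorithm, which the abstract does not specify. The scoring functions (mean absolute log ratio, mean pairwise cosine similarity of term weights), the thresholds, the function names and the gene and term data below are all assumptions chosen purely for illustration.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def literature_coherence(genes, profiles):
    """Mean pairwise cosine similarity of the genes' literature profiles."""
    pairs = list(combinations(genes, 2))
    if not pairs:
        return 0.0
    return sum(cosine(profiles[g], profiles[h]) for g, h in pairs) / len(pairs)


def expression_score(genes, log_ratios):
    """Mean absolute log ratio: a crude stand-in for expression significance."""
    return sum(abs(log_ratios[g]) for g in genes) / len(genes)


def is_active_context(genes, profiles, log_ratios, min_expr=1.0, min_coh=0.5):
    """A cluster qualifies only if it passes BOTH (hypothetical) thresholds."""
    return (expression_score(genes, log_ratios) >= min_expr
            and literature_coherence(genes, profiles) >= min_coh)


# Hypothetical toy data: term weights per gene and expression log ratios.
profiles = {
    "geneA": {"inflammation": 2.0, "cytokine": 1.0},
    "geneB": {"inflammation": 1.0, "cytokine": 2.0},
    "geneC": {"ribosome": 3.0},
}
log_ratios = {"geneA": 1.5, "geneB": -1.2, "geneC": 0.1}

coherent_pair = is_active_context(["geneA", "geneB"], profiles, log_ratios)
mixed_pair = is_active_context(["geneA", "geneC"], profiles, log_ratios)
```

In this toy setting, geneA and geneB are both strongly regulated and share literature terms, so the pair passes both thresholds. The pair geneA and geneC fails because geneC is barely regulated and shares no terms with geneA. A realistic implementation would replace both scores with proper statistical tests.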