Finding cross genome patterns in annotation graphs

  • Authors:
  • Joseph Benik; Caren Chang; Louiqa Raschid; Maria-Esther Vidal; Guillermo Palma; Andreas Thor

  • Affiliations:
  • University of Maryland; University of Maryland; University of Maryland; Universidad Simón Bolívar, Venezuela; Universidad Simón Bolívar, Venezuela; University of Leipzig, Germany

  • Venue:
  • DILS'12 Proceedings of the 8th international conference on Data Integration in the Life Sciences
  • Year:
  • 2012

Abstract

Annotation graph datasets are a natural representation of scientific knowledge. They are common in the life sciences, where concepts such as genes and proteins are annotated with controlled vocabulary terms from ontologies. Scientists are interested in analyzing or mining these annotations, in synergy with the literature, to discover patterns. Further, annotated datasets provide an avenue for scientists to explore shared annotations across genomes to support cross-genome discovery. We present a tool, PAnG (Patterns in Annotation Graphs), that is based on a complementary methodology of graph summarization and dense subgraphs. The elements of a graph summary correspond to a pattern, and the visualization of the summary can provide an explanation of the underlying knowledge. We present and analyze two distance metrics to identify related concepts in ontologies. We present preliminary results using groups of Arabidopsis and C. elegans genes to illustrate the potential benefits of cross-genome pattern discovery.