Data integration and pattern-finding in biological sequence with TESS's annotation grammar and extraction language (AnGEL)

  • Authors:
  • Jonathan Schug; Max Mintz; Christian J. Stoeckert, Jr.

  • Affiliations:
  • Department of Genetics in the School of Medicine, University of Pennsylvania, Philadelphia, PA; Department of Computer and Information Science in the School of Engineering, University of Pennsylvania, Philadelphia, PA; Department of Genetics in the School of Medicine, University of Pennsylvania, Philadelphia, PA

  • Venue:
  • DILS'07: Proceedings of the 4th International Conference on Data Integration in the Life Sciences
  • Year:
  • 2007

Abstract

Decoding the functional elements in an organism's genome requires integrating a wide variety of experimental and computational data from many sources. The location of these data, viewed as sequence features in the genome, must serve as one of the essential organizing principles for this integration, so a data integration system should take advantage of it. As part of the TESS project, we have developed a grammar-based data integration and pattern search tool, the Annotation Grammar and Extraction Language (AnGEL), that follows this principle. AnGEL can represent most current work in cis-regulatory module (CRM) modelling in an intuitive way and can process data extracted from a variety of sources simultaneously. Here we describe AnGEL's capabilities and illustrate its use by querying for gene arrangements, CRMs, and protein domain structure.