BayCis: a Bayesian hierarchical HMM for cis-regulatory module decoding in metazoan genomes

  • Authors:
  • Tien-Ho Lin;Pradipta Ray;Geir K. Sandve;Selen Uguroglu;Eric P. Xing

  • Affiliations:
  • School of Computer Science, Carnegie Mellon University, Pittsburgh, PA;School of Computer Science, Carnegie Mellon University, Pittsburgh, PA;Dept of Computer and Information Science, Norwegian University of Science and Technology, Trondheim, Norway;Dept of Computer Science and Engineering, Sabanci University, Istanbul, Turkey;School of Computer Science, Carnegie Mellon University, Pittsburgh, PA

  • Venue:
  • RECOMB'08 Proceedings of the 12th annual international conference on Research in computational molecular biology
  • Year:
  • 2008

Abstract

The transcriptional regulatory sequences in metazoan genomes often consist of multiple cis-regulatory modules (CRMs). Each CRM contains locally enriched occurrences of binding sites (motifs) for a certain array of regulatory proteins, capable of integrating, amplifying or attenuating multiple regulatory signals via combinatorial interaction with these proteins. The architecture of CRM organization is reminiscent of the grammatical rules underlying a natural language, and presents a particular challenge to computational motif and CRM identification in metazoan genomes. In this paper, we present BayCis, a Bayesian hierarchical HMM that attempts to capture the stochastic syntactic rules of CRM organization. Under the BayCis model, all candidate sites are evaluated based on a posterior probability measure that takes into consideration their similarity to known binding sites (BSs), their contrast against the local genomic context, their first-order dependencies on upstream sequence elements, as well as priors reflecting general knowledge of CRM structure. We compare our approach to five existing methods for the discovery of CRMs, and demonstrate competitive or superior prediction results evaluated against experimentally based annotations on a comprehensive selection of Drosophila regulatory regions. The software, database and Supplementary Materials will be available at http://www.sailing.cs.cmu.edu/baycis.
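
To make the idea of scoring candidate sites by posterior probability concrete, here is a minimal sketch of posterior decoding in a toy two-state HMM (background versus binding site), computed with the scaled forward-backward algorithm. It is not the BayCis model: the hierarchical state structure, the motif models, and the Bayesian priors described in the abstract are all omitted, and every name and parameter value below (`START`, `TRANS`, `EMIT`, `posterior_site_probability`) is a hypothetical choice made for illustration only.

```python
# Toy two-state HMM: state 0 = background, state 1 = binding site.
# All parameters are hypothetical and chosen only for illustration;
# they are not taken from the BayCis paper.
STATES = (0, 1)
START = [0.9, 0.1]                 # initial state distribution
TRANS = [[0.9, 0.1],               # TRANS[r][s] = P(next state s | state r)
         [0.2, 0.8]]
EMIT = [
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},  # uniform background
    {"A": 0.05, "C": 0.45, "G": 0.45, "T": 0.05},  # GC-rich "site" state
]


def posterior_site_probability(seq):
    """Return P(state = site | whole sequence) for each position of seq."""
    n = len(seq)

    # Forward pass; each row is rescaled to sum to 1 to avoid underflow.
    fwd, scales = [], []
    prev = None
    for t, base in enumerate(seq):
        if t == 0:
            row = [START[s] * EMIT[s][base] for s in STATES]
        else:
            row = [
                EMIT[s][base] * sum(prev[r] * TRANS[r][s] for r in STATES)
                for s in STATES
            ]
        c = sum(row)
        row = [v / c for v in row]
        fwd.append(row)
        scales.append(c)
        prev = row

    # Backward pass, reusing the forward scaling factors.
    bwd = [[1.0, 1.0] for _ in range(n)]
    for t in range(n - 2, -1, -1):
        nxt = seq[t + 1]
        bwd[t] = [
            sum(TRANS[s][r] * EMIT[r][nxt] * bwd[t + 1][r] for r in STATES)
            / scales[t + 1]
            for s in STATES
        ]

    # Combine both passes into per-position posteriors for the site state.
    post = []
    for t in range(n):
        gamma = [fwd[t][s] * bwd[t][s] for s in STATES]
        post.append(gamma[1] / sum(gamma))
    return post


if __name__ == "__main__":
    example = "ATATATGCGCGCGCATATAT"
    for base, p in zip(example, posterior_site_probability(example)):
        print(base, round(p, 3))
```

In this sketch the posterior at each position depends on the whole sequence, so a GC-rich stretch scores higher when its flanks look like background. That is the sense in which a posterior measure accounts for local context, although the actual model is considerably richer.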