A discriminative model for identifying spatial cis-regulatory modules

  • Authors:
  • Eran Segal; Roded Sharan

  • Affiliations:
  • Stanford University, Stanford, CA; International Computer Science Institute, Berkeley, CA

  • Venue:
  • RECOMB '04 Proceedings of the eighth annual international conference on Research in computational molecular biology
  • Year:
  • 2004

Abstract

Transcriptional regulation is mediated by the coordinated binding of transcription factors to the upstream region of genes. In higher eukaryotes, the binding sites of cooperating transcription factors are organized into short sequence units, called cis-regulatory modules. In this paper we propose a method for identifying modules of transcription factor binding sites in a set of co-regulated genes, using only the raw sequence data as input. Our method is based on a novel probabilistic model that describes the mechanism of cis-regulation, including the binding sites of cooperating transcription factors, the organization of these binding sites into short sequence modules, and the regulation of a gene by its modules. We show that our method is successful in discovering planted modules in simulated data and known modules in yeast. More importantly, we applied our method to a large collection of human gene sets, and found 83 significant cis-regulatory modules, which included 36 known motifs and many novel ones. Thus, our results provide one of the first comprehensive compendiums of putative cis-regulatory modules in human.
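
To make the notion of binding sites clustering into short sequence modules concrete, here is a minimal Python sketch. It is not the probabilistic model proposed in the paper: it swaps in a naive heuristic that thresholds position weight matrix (PWM) log-odds scores and reports windows containing a hit for every motif. The helper names (`consensus_pwm`, `motif_hits`, `candidate_modules`), the toy motifs, the uniform background, and the threshold are all assumptions made for illustration.

```python
import math

BACKGROUND = 0.25  # assumption: uniform background base frequency


def consensus_pwm(consensus, p=0.85):
    """Build a toy PWM that gives probability p to each consensus base
    and splits the remaining mass evenly over the other three bases."""
    q = (1.0 - p) / 3.0
    return [{b: (p if b == c else q) for b in "ACGT"} for c in consensus]


def log_odds(pwm, site):
    """Log-odds score of `site` under `pwm` versus the background."""
    return sum(math.log(col[base] / BACKGROUND) for col, base in zip(pwm, site))


def motif_hits(pwm, seq, threshold):
    """Start positions whose PWM log-odds score is at least `threshold`."""
    width = len(pwm)
    return [i for i in range(len(seq) - width + 1)
            if log_odds(pwm, seq[i:i + width]) >= threshold]


def candidate_modules(seq, pwms, threshold, window):
    """Start positions of length-`window` windows that fully contain
    at least one hit for every motif in `pwms`."""
    hits = {name: motif_hits(pwm, seq, threshold) for name, pwm in pwms.items()}
    modules = []
    for start in range(len(seq) - window + 1):
        end = start + window
        if all(any(start <= h and h + len(pwms[name]) <= end for h in hs)
               for name, hs in hits.items()):
            modules.append(start)
    return modules


# Toy upstream sequence with two planted sites (hypothetical data).
seq = "TTTT" + "ACGT" + "TT" + "GGCA" + "TTTT"
pwms = {"m1": consensus_pwm("ACGT"), "m2": consensus_pwm("GGCA")}
modules = candidate_modules(seq, pwms, threshold=4.0, window=10)
```

In this toy input the two planted sites start at positions 4 and 10, so only the 10-base window starting at position 4 contains both. A discriminative probabilistic model, as described in the abstract, would instead learn the motifs and the module structure jointly from the co-regulated gene set rather than take fixed PWMs and a hard threshold as given.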