A feature-based approach to modeling protein-DNA interactions

  • Authors:
  • Eilon Sharon;Eran Segal

  • Affiliations:
  • Department of Computer Science, Weizmann Institute of Science, Rehovot, Israel;Department of Computer Science, Weizmann Institute of Science, Rehovot, Israel

  • Venue:
  • RECOMB'07 Proceedings of the 11th annual international conference on Research in computational molecular biology
  • Year:
  • 2007

Abstract

Transcription factor (TF) binding to its DNA target site is a fundamental regulatory interaction. The most common model used to represent TF binding specificities is the position-specific scoring matrix (PSSM), which assumes independence between binding positions. In many cases, this simplifying assumption does not hold. Here, we present feature motif models (FMMs), a novel probabilistic method for modeling TF-DNA interactions, based on Markov networks. Our approach uses sequence features to represent TF binding specificities, where each feature may span multiple positions. We develop the mathematical formulation of our models, and devise an algorithm for learning their structural features from binding site data. We evaluate our approach on synthetic data, and then apply it to binding site and ChIP-chip data from yeast. We reveal sequence features that are present in the binding specificities of yeast TFs, and show that FMMs explain the binding data significantly better than PSSMs.
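To make the contrast between the two model families concrete, here is a minimal Python sketch. It is not the authors' formulation or learning algorithm, which the abstract does not give. It assumes a generic log-linear (Markov network) form in which each feature is a set of position-to-base assignments with a weight, and it normalizes by brute-force enumeration, which is only feasible for very short sites. All function names, the feature encoding, and the toy weights are hypothetical.

```python
import itertools
import math

ALPHABET = "ACGT"


def pssm_log_prob(seq, pssm):
    """Log-probability under a PSSM.

    pssm[i][base] is the probability of `base` at position i. Positions are
    treated as independent, so the per-position log-probabilities simply add.
    """
    return sum(math.log(pssm[i][base]) for i, base in enumerate(seq))


def feature_score(seq, features):
    """Unnormalized log-linear score (illustrative assumption, not the paper's).

    Each feature is (assignment, weight), where assignment maps positions to
    required bases and may span several positions. A feature contributes its
    weight only if every position in its assignment matches.
    """
    return sum(
        weight
        for assignment, weight in features
        if all(seq[pos] == base for pos, base in assignment.items())
    )


def feature_log_prob(seq, features, length):
    """Normalized log-probability, summing over all 4**length sequences."""
    partition = sum(
        math.exp(feature_score("".join(s), features))
        for s in itertools.product(ALPHABET, repeat=length)
    )
    return feature_score(seq, features) - math.log(partition)


if __name__ == "__main__":
    # Hypothetical toy model: one feature rewarding 'A' at position 0
    # together with 'T' at position 1, a dependency a PSSM cannot encode.
    pair_feature = [({0: "A", 1: "T"}, 2.0)]
    uniform_pssm = [{b: 0.25 for b in ALPHABET} for _ in range(2)]
    for site in ("AT", "AC", "GT"):
        print(site,
              math.exp(pssm_log_prob(site, uniform_pssm)),
              math.exp(feature_log_prob(site, pair_feature, 2)))
```

With only single-position features, a model of this form factorizes across positions just as a PSSM does. Features that span several positions are what let it express dependencies between them.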