A feature-based approach to modeling protein-DNA interactions

Authors:
Eilon Sharon; Eran Segal
Affiliations:
Department of Computer Science, Weizmann Institute of Science, Rehovot, Israel; Department of Computer Science, Weizmann Institute of Science, Rehovot, Israel
Venue:
RECOMB'07 Proceedings of the 11th annual international conference on Research in computational molecular biology
Year:
2007

Abstract

Transcription factor (TF) binding to its DNA target site is a fundamental regulatory interaction. The most common model used to represent TF binding specificities is a position-specific scoring matrix (PSSM), which assumes independence between binding positions. In many cases, this simplifying assumption does not hold. Here, we present feature motif models (FMMs), a novel probabilistic method for modeling TF-DNA interactions, based on Markov networks. Our approach uses sequence features to represent TF binding specificities, where each feature may span multiple positions. We develop the mathematical formulation of our models, and devise an algorithm for learning their structural features from binding site data. We evaluate our approach on synthetic data, and then apply it to binding site and ChIP-chip data from yeast. We reveal sequence features that are present in the binding specificities of yeast TFs, and show that FMMs explain the binding data significantly better than PSSMs.
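
The contrast between a PSSM and a feature-based model can be illustrated with a small sketch. The code below is not the paper's formulation or learning algorithm; it is a toy log-linear model over two-base sites, and the names `pssm_log_prob`, `fmm_log_prob`, `toy_pssm`, and `toy_features`, as well as every probability and weight, are assumptions made up for illustration. It imagines a TF that binds `AA` and `TT` but not `AT`: a PSSM built from such sites gives `AT` the same score as `AA`, while a model with two features spanning both positions separates them.

```python
import itertools
import math

ALPHABET = "ACGT"


def pssm_log_prob(seq, pssm):
    """Log-probability under a PSSM: each position contributes independently."""
    return sum(math.log(pssm[i][base]) for i, base in enumerate(seq))


def feature_score(seq, features):
    """Sum the weights of all features that fire on `seq`.

    A feature is a tuple of (position, base) pairs and fires only when
    every listed position carries the listed base.
    """
    return sum(
        weight
        for feature, weight in features.items()
        if all(seq[pos] == base for pos, base in feature)
    )


def fmm_log_prob(seq, features, length):
    """Log-probability under a toy log-linear model over fixed-length sites.

    The partition function is computed by brute-force enumeration of all
    4**length sequences, which is only feasible for very short sites.
    """
    log_z = math.log(
        sum(
            math.exp(feature_score("".join(s), features))
            for s in itertools.product(ALPHABET, repeat=length)
        )
    )
    return feature_score(seq, features) - log_z


# Hypothetical PSSM for sites that are half "AA" and half "TT"
# (small pseudo-probabilities for C and G avoid log(0)).
toy_pssm = [
    {"A": 0.49, "C": 0.01, "G": 0.01, "T": 0.49},
    {"A": 0.49, "C": 0.01, "G": 0.01, "T": 0.49},
]

# Hypothetical features, each spanning both positions, with made-up weights.
toy_features = {
    ((0, "A"), (1, "A")): 3.0,
    ((0, "T"), (1, "T")): 3.0,
}

if __name__ == "__main__":
    for site in ("AA", "TT", "AT"):
        print(
            site,
            round(pssm_log_prob(site, toy_pssm), 3),
            round(fmm_log_prob(site, toy_features, 2), 3),
        )
```

In this toy setting the PSSM cannot tell `AT` from `AA` because it multiplies per-position probabilities, whereas the pairwise features assign extra weight only to the co-occurring bases. Exact enumeration of the partition function is used purely to keep the sketch short and would not scale to realistic site lengths.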