Graphical models of residue coupling in protein families

Authors:
John Thomas; Naren Ramakrishnan; Chris Bailey-Kellogg
Affiliations:
Dartmouth College, Hanover, NH; Virginia Tech, Blacksburg, VA; Dartmouth College, Hanover, NH
Venue:
Proceedings of the 5th international workshop on Bioinformatics
Year:
2005

Abstract

Identifying residue coupling relationships within a protein family can provide important insights into the family's evolutionary record, and has significant applications in analyzing and optimizing sequence-structure-function relationships. We present the first algorithm to infer an undirected graphical model representing residue coupling in protein families. Such a model, which we call a residue coupling network, serves as a compact description of the joint amino acid distribution, focused on the independences among residues. In contrast, current methods manipulate dense representations of co-variation and focus on assessing dependence, which can conflate direct and indirect relationships. Our probabilistic model provides a sound basis for predictive (will this newly designed protein be folded and functional?), diagnostic (why is this protein not stable or functional?), and abductive (what if I attempt to graft features of one protein family onto another?) reasoning. Further, our algorithm can readily incorporate, as priors, hypotheses regarding possible underlying mechanistic/energetic explanations for coupling. The resulting approach constitutes a powerful and discriminatory mechanism to identify residue coupling from protein sequences and structures. Analysis results on the G protein-coupled receptor (GPCR) and PDZ domain families demonstrate the ability of our approach to effectively uncover and exploit models of residue coupling.
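
The distinction the abstract draws between direct and indirect coupling can be made concrete with a small sketch. It is not the authors' algorithm. It assumes conditional mutual information as the independence measure and uses a hypothetical 32-sequence toy alignment with three columns i, j, k, built so that i and k each depend on j but not on each other. The function names and the toy data are illustrative assumptions.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two alignment columns."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts
    return sum((c / n) * log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


def conditional_mutual_information(xs, ys, zs):
    """Empirical I(X; Y | Z): mutual information within each Z stratum,
    weighted by the stratum's frequency."""
    n = len(xs)
    total = 0.0
    for z, cz in Counter(zs).items():
        idx = [t for t in range(n) if zs[t] == z]
        total += (cz / n) * mutual_information([xs[t] for t in idx],
                                               [ys[t] for t in idx])
    return total


def build_toy_alignment():
    """Hypothetical toy data: column j drives columns i and k independently
    (a chain i - j - k), with 3:1 preference for the 'matching' residue."""
    rows = []
    for j, major_i, minor_i, major_k, minor_k in [("A", "L", "V", "D", "E"),
                                                  ("G", "V", "L", "E", "D")]:
        for i, wi in [(major_i, 3), (minor_i, 1)]:
            for k, wk in [(major_k, 3), (minor_k, 1)]:
                rows.extend([(i, j, k)] * (wi * wk))
    return rows


col_i, col_j, col_k = zip(*build_toy_alignment())

mi_ik = mutual_information(col_i, col_k)                          # marginal co-variation
cmi_ik_given_j = conditional_mutual_information(col_i, col_k, col_j)  # indirect pair
cmi_ij_given_k = conditional_mutual_information(col_i, col_j, col_k)  # direct pair

print(f"I(i;k)   = {mi_ik:.4f} bits")
print(f"I(i;k|j) = {cmi_ik_given_j:.4f} bits")
print(f"I(i;j|k) = {cmi_ij_given_k:.4f} bits")
```

In this toy data, columns i and k co-vary when viewed as a pair, so a purely pairwise dependence score would report them as coupled. Conditioning on j removes that signal, while the i-j dependence survives conditioning on k. An undirected graphical model would therefore keep the edges i-j and j-k and omit i-k.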