Learning Markov networks: maximum bounded tree-width graphs
SODA '01 Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms
Mining complex models from arbitrarily large databases in constant time
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Information Theory, Inference & Learning Algorithms
Information Theory, Inference & Learning Algorithms
Learning bayesian network structure from massive datasets: the «sparse candidate« algorithm
UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Graphical Models of Residue Coupling in Protein Families
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Using physicochemical properties of amino acids to induce graphical models of residue couplings
Proceedings of the Tenth International Workshop on Data Mining in Bioinformatics
RECOMB'06 Proceedings of the 10th annual international conference on Research in Computational Molecular Biology
Hi-index | 0.00 |
Identifying residue coupling relationships within a protein family can provide important insights into the family's evolutionary record, and has significant applications in analyzing and optimizing sequence-structure-function relationships. We present the first algorithm to infer an undirected graphical model representing residue coupling in protein families. Such a model, which we call a residue coupling network, serves as a compact description of the joint amino acid distribution, focused on the independences among residues. This stands in contrast to current methods, which manipulate dense representations of co-variation and are focused on assessing dependence, which can conflate direct and indirect relationships. Our probabilistic model provides a sound basis for predictive (will this newly designed protein be folded and functional?), diagnostic (why is this protein not stable or functional?), and abductive reasoning (what if I attempt to graft features of one protein family onto another?). Further, our algorithm can readily incorporate, as priors, hypotheses regarding possible underlying mechanistic/energetic explanations for coupling. The resulting approach constitutes a powerful and discriminatory mechanism to identify residue coupling from protein sequences and structures. Analysis results on the G-protein coupled receptor (GPCR) and PDZ domain families demonstrate the ability of our approach to effectively uncover and exploit models of residue coupling.