Graphical models of residue coupling in protein families

  • Authors: John Thomas; Naren Ramakrishnan; Chris Bailey-Kellogg
  • Affiliations: Dartmouth College, Hanover, NH; Virginia Tech, Blacksburg, VA; Dartmouth College, Hanover, NH
  • Venue: Proceedings of the 5th international workshop on Bioinformatics
  • Year: 2005

Abstract

Identifying residue coupling relationships within a protein family can provide important insights into the family's evolutionary record, and has significant applications in analyzing and optimizing sequence-structure-function relationships. We present the first algorithm to infer an undirected graphical model representing residue coupling in protein families. Such a model, which we call a residue coupling network, serves as a compact description of the joint amino acid distribution, focused on the independences among residues. This stands in contrast to current methods, which manipulate dense representations of co-variation and are focused on assessing dependence, which can conflate direct and indirect relationships. Our probabilistic model provides a sound basis for predictive (will this newly designed protein be folded and functional?), diagnostic (why is this protein not stable or functional?), and abductive reasoning (what if I attempt to graft features of one protein family onto another?). Further, our algorithm can readily incorporate, as priors, hypotheses regarding possible underlying mechanistic/energetic explanations for coupling. The resulting approach constitutes a powerful and discriminatory mechanism to identify residue coupling from protein sequences and structures. Analysis results on the G-protein coupled receptor (GPCR) and PDZ domain families demonstrate the ability of our approach to effectively uncover and exploit models of residue coupling.
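The contrast the abstract draws between dependence and conditional independence can be illustrated with a small sketch. This is not the authors' algorithm: it is a minimal example that assumes plain empirical mutual information as the dependence measure. The three-column alignment and the function names are hypothetical. The toy data are built so that columns 1 and 3 are each coupled to column 2 but are independent of each other once column 2 is known.

```python
from collections import Counter
from math import log


def mutual_information(col_x, col_y):
    """Empirical mutual information (in nats) between two alignment columns."""
    n = len(col_x)
    px = Counter(col_x)
    py = Counter(col_y)
    pxy = Counter(zip(col_x, col_y))
    return sum(
        (c / n) * log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def conditional_mutual_information(col_x, col_y, col_z):
    """Empirical I(X; Y | Z): MI of X and Y within each value of Z, weighted by P(Z)."""
    n = len(col_z)
    total = 0.0
    for z, count_z in Counter(col_z).items():
        idx = [i for i, v in enumerate(col_z) if v == z]
        total += (count_z / n) * mutual_information(
            [col_x[i] for i in idx], [col_y[i] for i in idx]
        )
    return total


# Hypothetical toy alignment of 32 three-residue sequences (illustrative only).
# Within each value of the middle residue, positions 1 and 3 vary independently.
rows = (
    ["LAK"] * 9 + ["LAR"] * 3 + ["VAK"] * 3 + ["VAR"] * 1
    + ["LGK"] * 1 + ["LGR"] * 3 + ["VGK"] * 3 + ["VGR"] * 9
)
col1, col2, col3 = (list(c) for c in zip(*rows))

# Marginally, columns 1 and 3 co-vary (positive mutual information) ...
mi_13 = mutual_information(col1, col3)
# ... but the coupling vanishes once column 2 is conditioned on.
cmi_13_given_2 = conditional_mutual_information(col1, col3, col2)

print(f"I(1;3)     = {mi_13:.4f}")
print(f"I(1;3 | 2) = {cmi_13_given_2:.4f}")
```

A pairwise co-variation score would report an association between columns 1 and 3 here, whereas an undirected graphical model would need only the edges 1-2 and 2-3 to explain the same data. That is the sense in which focusing on independences separates direct from indirect relationships.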