A Markov random field framework for protein side-chain resonance assignment

  • Authors:
  • Jianyang Zeng;Pei Zhou;Bruce Randall Donald

  • Affiliations:
  • Department of Computer Science, Duke University, Durham, NC;Department of Biochemistry, Duke University Medical Center, Durham, NC;Department of Computer Science, Duke University, Durham, NC

  • Venue:
  • RECOMB'10: Proceedings of the 14th Annual International Conference on Research in Computational Molecular Biology
  • Year:
  • 2010

Abstract

Nuclear magnetic resonance (NMR) spectroscopy plays a critical role in structural genomics, and serves as a primary tool for determining protein structures, dynamics and interactions in physiologically relevant solution conditions. The current speed of protein structure determination via NMR is limited by the lengthy time required for resonance assignment, which maps spectral peaks to specific atoms and residues in the primary sequence. Although numerous algorithms have been developed to address the backbone resonance assignment problem [68,2,10,37,14,64,1,31,60], little work has been done to automate side-chain resonance assignment [43,48,5]. Most previous attempts at assigning side-chain resonances depend on a set of NMR experiments that record through-bond interactions with side-chain protons for each residue. Unfortunately, these NMR experiments have low sensitivity and limited performance on large proteins, which makes it difficult to obtain enough side-chain resonance assignments. On the other hand, it is essential to obtain almost all of the side-chain resonance assignments as a prerequisite for high-resolution structure determination. To overcome this deficiency, we present a novel side-chain resonance assignment algorithm based on alternative NMR experiments measuring through-space interactions between protons in the protein, which also provide crucial distance restraints and are normally required in high-resolution structure determination. We cast the side-chain resonance assignment problem into a Markov Random Field (MRF) framework, and extend and apply combinatorial protein design algorithms to compute the optimal solution that best interprets the NMR data. Our MRF framework captures the contact map information of the protein derived from NMR spectra, and exploits the structural information available from the backbone conformations determined by orientational restraints and a set of discretized side-chain conformations (i.e., rotamers). A Hausdorff-based computation is employed in the scoring function to evaluate the probability that a set of side-chain resonance assignments generates the observed NMR spectra. The complexity of the assignment problem is first reduced by using a dead-end elimination (DEE) algorithm, which prunes side-chain resonance assignments that are provably not part of the optimal solution. Then an A* search algorithm is used to find a set of optimal side-chain resonance assignments that best fit the NMR data. We have tested our algorithm on NMR data for five proteins, including the FF Domain 2 of human transcription elongation factor CA150 (FF2), the B1 domain of Protein G (GB1), human ubiquitin, the ubiquitin-binding zinc finger domain of the human Y-family DNA polymerase Eta (pol η UBZ), and the human Set2-Rpb1 interacting domain (hSRI). Our algorithm assigns resonances for more than 90% of the protons in the proteins, and achieves about 80% correct side-chain resonance assignments. The final structures computed using distance restraints resulting from the set of assigned side-chain resonances have backbone RMSD 0.5–1.4 Å and all-heavy-atom RMSD 1.0–2.2 Å from the reference structures that were determined by X-ray crystallography or traditional NMR approaches. These results demonstrate that our algorithm can be successfully applied to automate side-chain resonance assignment and high-quality protein structure determination. Since our algorithm does not require any specific NMR experiments for measuring the through-bond interactions with side-chain protons, it can save a significant amount of both experimental cost and spectrometer time, and hence accelerate the NMR structure determination process.
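
The pruning step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the MRF score is written as an energy (for example a negative log-probability) made of one term per site plus one term per pair of sites, and it uses Goldstein's criterion, a standard DEE variant that may differ from the one used in the paper. The function names (`dee_prune`, `best_labeling`) and all numbers in `self_e` and `pair_e` are made up for illustration, and an exhaustive search stands in for the A* search.

```python
from itertools import product


def pair(pair_e, i, a, j, b):
    """Pairwise term for label a at site i and label b at site j (any order)."""
    return pair_e[(i, j)][a][b] if i < j else pair_e[(j, i)][b][a]


def total_energy(assign, self_e, pair_e):
    """Energy of a full labeling: sum of per-site and pairwise terms."""
    n = len(assign)
    e = sum(self_e[i][assign[i]] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            e += pair_e[(i, j)][assign[i]][assign[j]]
    return e


def dee_prune(self_e, pair_e):
    """Goldstein DEE: drop label r at site i if some label t is always better."""
    n = len(self_e)
    alive = [set(range(len(s))) for s in self_e]
    changed = True
    while changed:
        changed = False
        for i in range(n):
            for r in list(alive[i]):
                for t in alive[i] - {r}:
                    # Lower bound on E(r) - E(t) over all labelings of other sites.
                    gap = self_e[i][r] - self_e[i][t]
                    for j in range(n):
                        if j == i:
                            continue
                        gap += min(
                            pair(pair_e, i, r, j, s) - pair(pair_e, i, t, j, s)
                            for s in alive[j]
                        )
                    if gap > 0:  # r can never be part of the optimum
                        alive[i].discard(r)
                        changed = True
                        break
    return alive


def best_labeling(domains, self_e, pair_e):
    """Exhaustive search over the given label sets (stand-in for A*)."""
    return min(
        (total_energy(a, self_e, pair_e), a)
        for a in product(*(sorted(d) for d in domains))
    )


# Toy problem (hypothetical numbers): 3 sites with 2, 3 and 2 candidate labels.
self_e = [[0.0, 5.0], [1.0, 0.0, 4.0], [0.5, 0.2]]
pair_e = {
    (0, 1): [[0.0, 1.0, 0.5], [0.3, 0.2, 0.1]],
    (0, 2): [[0.2, 0.0], [0.1, 0.4]],
    (1, 2): [[0.0, 0.6], [0.5, 0.0], [0.3, 0.3]],
}

alive = dee_prune(self_e, pair_e)
full_domains = [range(len(s)) for s in self_e]
full_best = best_labeling(full_domains, self_e, pair_e)
pruned_best = best_labeling(alive, self_e, pair_e)
```

In this sketch `full_best` and `pruned_best` should agree, since a label is removed only when another label at the same site gives strictly lower energy for every choice at the other sites. The search over the surviving labels is then smaller, which is the role DEE plays ahead of the A* search in the abstract.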