A Bayesian approach for determining protein side-chain rotamer conformations using unassigned NOE data

  • Authors:
  • Jianyang Zeng;Kyle E. Roberts;Pei Zhou;Bruce R. Donald

  • Affiliations:
  • Department of Computer Science, Duke University, Durham, NC;Program in Computational Biology and Bioinformatics, Duke University, Durham NC;Department of Biochemistry, Duke University Medical Center, Durham, NC;Department of Computer Science, Duke University, Durham, NC and Program in Computational Biology and Bioinformatics, Duke University, Durham NC and Department of Biochemistry, Duke University Medi ...

  • Venue:
  • RECOMB'11: Proceedings of the 15th Annual International Conference on Research in Computational Molecular Biology
  • Year:
  • 2011

Abstract

A major bottleneck in protein structure determination via nuclear magnetic resonance (NMR) is the lengthy and laborious process of assigning resonances and nuclear Overhauser effect (NOE) cross peaks. Recent studies have shown that accurate backbone folds can be determined using sparse NMR data, such as residual dipolar couplings (RDCs) or backbone chemical shifts. This raises the question of whether accurate protein side-chain conformations can also be determined using sparse or unassigned NMR data. We address this question using unassigned nuclear Overhauser effect spectroscopy (NOESY) data, which record the through-space dipolar interactions between protons that are close in 3D space. We propose a Bayesian approach with a Markov random field (MRF) model that integrates the likelihood function derived from the observed experimental data with prior information (i.e., empirical molecular mechanics energies) about the protein structures. We unify the side-chain structure prediction problem with the side-chain structure determination problem using unassigned NMR data, and apply the deterministic dead-end elimination (DEE) and A* search algorithms to provably find the globally optimal solution that maximizes the posterior probability. We employ a Hausdorff-based measure to derive the likelihood of a rotamer or a pairwise rotamer interaction from unassigned NOESY data. In addition, we apply a systematic and rigorous approach to estimate the experimental noise in the NMR data, which also determines the weighting factor of the data term in the scoring function derived from the Bayesian framework. We tested our approach on real NMR data for three proteins: the FF Domain 2 of human transcription elongation factor CA150 (FF2), the B1 domain of Protein G (GB1), and human ubiquitin. The promising results indicate that our approach can be applied in high-resolution protein structure determination. Since our approach does not require any NOE assignment, it can accelerate the NMR structure determination process.
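
To make the idea of a Hausdorff-based likelihood concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each candidate rotamer yields a set of back-computed NOESY peak positions (in ppm), that the fit to the unassigned spectrum is measured by the plain directed Hausdorff distance, and that this distance is turned into a log-likelihood under a Gaussian noise model with standard deviation `sigma`. The function names, the peak coordinates, and the Gaussian form are all illustrative assumptions; the abstract does not specify the exact measure or noise model.

```python
import math


def directed_hausdorff(predicted, observed):
    """Largest distance from any predicted peak to its nearest observed peak.

    Both arguments are non-empty lists of equal-length coordinate tuples
    (hypothetical chemical-shift positions in ppm).
    """
    if not predicted or not observed:
        raise ValueError("both peak lists must be non-empty")
    return max(min(math.dist(p, q) for q in observed) for p in predicted)


def log_likelihood(predicted, observed, sigma):
    """Unnormalized Gaussian log-likelihood of the Hausdorff distance.

    Assumption: the mismatch is modeled as zero-mean Gaussian noise with
    standard deviation sigma; the normalization constant is dropped because
    it is the same for every rotamer being compared.
    """
    h = directed_hausdorff(predicted, observed)
    return -(h * h) / (2.0 * sigma * sigma)


# Hypothetical unassigned 2D peak list and back-computed peaks for two rotamers.
observed = [(8.10, 4.50), (7.20, 1.90), (8.50, 3.10)]
rotamer_a = [(8.10, 4.50), (7.20, 2.00)]  # every peak has a close match
rotamer_b = [(8.10, 4.50), (6.00, 1.90)]  # one peak is far from all data

h_a = directed_hausdorff(rotamer_a, observed)
h_b = directed_hausdorff(rotamer_b, observed)
ll_a = log_likelihood(rotamer_a, observed, sigma=0.5)
ll_b = log_likelihood(rotamer_b, observed, sigma=0.5)
```

In this toy example, `rotamer_a` scores a higher log-likelihood than `rotamer_b` because all of its predicted peaks lie near observed peaks. In a Bayesian scoring function of the kind described above, such a data term would be combined with the prior energy term, with the noise estimate (`sigma` here) setting the relative weight of the data.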