A Confidence Measure for Model Fitting with X-Ray Crystallography Data

  • Authors:
  • Yang Lei; Ramgopal R. Mettu

  • Affiliations:
  • Department of Electrical and Computer Engineering, University of Massachusetts, Amherst, MA 01003; Department of Computer Science, Tulane University, New Orleans, LA 70118

  • Venue:
  • Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics
  • Year:
  • 2013

Abstract

Structure determination from X-ray crystallography requires numerous stages of iterative refinement between real and reciprocal space. Current methods that fit a model structure to X-ray data therefore use a refined experimental electron density map along with a scoring function that characterizes the fit of the density map to the structure. Additional information (e.g., from an energy function or conformational statistics) may supplement this score. In this paper, we derive a novel confidence measure for fitting model fragments into X-ray crystallography data. Given any set of conformations under consideration (e.g., a set of sidechain rotamers or backbone fragments) and a scoring function for those conformations (e.g., least-squares fit of the associated model density maps), we give a general-purpose method for assessing the confidence of the best-fit model. For the commonly used least-squares measure of fit, our method analyzes the statistics of the matching scores and estimates the probability that the best-fit conformation is the correct underlying model. To our knowledge, ours is the first method for computing such a confidence measure. To demonstrate the practical utility of our method, we study the problem of sidechain placement and show that our confidence measure can be used to detect and correct incorrect conformational predictions. Over nine proteins with density maps of varying resolutions, the Pearson correlation between predictive accuracy (of least-squares fit) and our confidence measure is high, about 0.89. We show that our approach can guide the use of stereochemical restraints when confidence in predictions is low. We also propose a Bayesian data fusion scheme that integrates our confidence measure to weight the contribution of each source of data, which could potentially be used for combining experimental, modeling, and empirical data in automated structure determination.
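
As a rough illustration of the setting described in the abstract, the sketch below scores a set of candidate conformations by least-squares fit to an observed density map and turns those scores into a probability for the best-fit candidate. It is not the derivation from the paper, whose details are not given here. It assumes independent Gaussian noise with a known standard deviation `sigma` and a uniform prior over candidates, and the function names `least_squares_scores` and `best_fit_confidence` are hypothetical.

```python
import numpy as np


def least_squares_scores(observed, candidates):
    """Sum of squared differences between the observed map and each candidate map.

    observed:   array of density values (any shape)
    candidates: array whose first axis indexes candidate model maps,
                each with the same number of values as `observed`
    """
    observed = np.asarray(observed, dtype=float)
    candidates = np.asarray(candidates, dtype=float)
    # Flatten every map so the squared differences can be summed per candidate.
    diff = candidates.reshape(len(candidates), -1) - observed.reshape(1, -1)
    return np.sum(diff ** 2, axis=1)


def best_fit_confidence(scores, sigma):
    """Return (index of best candidate, its posterior probability).

    ASSUMPTION (not from the paper): i.i.d. Gaussian noise with standard
    deviation `sigma` and a uniform prior over candidates, so the posterior
    is proportional to exp(-score / (2 * sigma**2)).
    """
    scores = np.asarray(scores, dtype=float)
    log_like = -scores / (2.0 * sigma ** 2)
    log_like -= log_like.max()  # shift for numerical stability
    posterior = np.exp(log_like)
    posterior /= posterior.sum()
    best = int(np.argmin(scores))
    return best, float(posterior[best])
```

Under these assumptions, a confidence near 1 means the best candidate fits far better than the alternatives, while a value near 1/n for n candidates means the scores barely distinguish them. The second case is the kind of situation in which the abstract suggests bringing in stereochemical restraints.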