Minimizing and learning energy functions for side-chain prediction

Authors:
Chen Yanover;Ora Schueler-Furman;Yair Weiss
Affiliations:
School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel;Department of Molecular Genetics and Biotechnology, Hadassah Medical School, The Hebrew University of Jerusalem, Jerusalem, Israel;School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
Venue:
RECOMB'07 Proceedings of the 11th annual international conference on Research in computational molecular biology
Year:
2007

Abstract

Side-chain prediction is an important subproblem of the general protein folding problem. Despite much progress in side-chain prediction, performance remains far from satisfactory. For example, the ROSETTA program, which uses simulated annealing to select minimum-energy conformations, correctly predicts the first two side-chain angles for approximately 72% of the buried residues in a standard data set. Is further improvement more likely to come from better search methods or from better energy functions? Because exact minimization of the energy is NP-hard, this question is difficult to answer systematically. In this paper, we present a novel search method and a novel method for learning energy functions from training data, both based on Tree Reweighted Belief Propagation (TRBP). We find that TRBP finds the global optimum of the ROSETTA energy function in a few minutes of computation for approximately 85% of the proteins in a standard benchmark set. TRBP can also effectively bound the partition function, which enables the use of the Conditional Random Fields (CRF) framework for learning. Interestingly, finding the global minimum does not significantly improve side-chain prediction for an energy function based on ROSETTA's default energy terms (the improvement is less than 0.1%), while learning new weights gives a significant boost, from 72% to 78%. With a recently modified ROSETTA energy function that has a softer Lennard-Jones repulsive term, the global optimum does improve prediction accuracy, from 77% to 78%. Here again, learning new weights improves side-chain modeling even further, to 80%. Finally, the highest accuracy (82.6%) is obtained using an extended rotamer library and CRF-learned weights. Our results suggest that combining machine learning with approximate inference can improve the state of the art in side-chain prediction.
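
The distinction the abstract draws between a better search method and a better energy function can be illustrated on a toy problem. The sketch below is not TRBP, and it does not use the ROSETTA energy function: it assumes a generic pairwise energy over discrete rotamer choices, with made-up numbers (`UNARY`, `PAIRWISE`) and hypothetical helper names, and compares a brute-force global minimum with a simple one-residue-at-a-time descent. It shows only that a local search can stop at a conformation that is not the global optimum, which is why knowing the true minimum helps separate search errors from energy-function errors.

```python
import itertools

# Toy instance (illustrative values only, not taken from the paper):
# two residues, two candidate rotamers each.
UNARY = [[0.0, 0.5], [0.0, 0.5]]            # UNARY[i][r]: energy of rotamer r at residue i
PAIRWISE = {(0, 1): [[0.0, 2.0],            # PAIRWISE[(i, j)][ri][rj]: interaction energy
                     [2.0, -2.0]]}


def total_energy(assignment, unary, pairwise):
    """Sum the per-residue terms and the residue-pair terms for one assignment."""
    energy = sum(unary[i][r] for i, r in enumerate(assignment))
    for (i, j), table in pairwise.items():
        energy += table[assignment[i]][assignment[j]]
    return energy


def exact_minimum(unary, pairwise):
    """Enumerate every assignment; feasible only for tiny toy instances."""
    choices = [range(len(u)) for u in unary]
    return min(itertools.product(*choices),
               key=lambda a: total_energy(a, unary, pairwise))


def greedy_descent(unary, pairwise, start):
    """Change one residue at a time, accepting only strict energy decreases."""
    current = list(start)
    improved = True
    while improved:
        improved = False
        for i in range(len(unary)):
            for r in range(len(unary[i])):
                candidate = current[:i] + [r] + current[i + 1:]
                if (total_energy(candidate, unary, pairwise)
                        < total_energy(current, unary, pairwise)):
                    current = candidate
                    improved = True
    return tuple(current)


if __name__ == "__main__":
    best = exact_minimum(UNARY, PAIRWISE)
    local = greedy_descent(UNARY, PAIRWISE, start=(0, 0))
    print("global minimum:", best, total_energy(best, UNARY, PAIRWISE))
    print("greedy result: ", local, total_energy(local, UNARY, PAIRWISE))
```

In this toy instance the favorable pair interaction is reachable only by changing both residues at once, so the single-residue descent started from `(0, 0)` stays there while enumeration finds `(1, 1)`. Enumeration scales exponentially with the number of residues, which is consistent with the abstract's remark that exact minimization is NP-hard in general.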