Protein fold recognition is a crucial step in inferring biological structure and function. This paper focuses on machine learning methods for predicting quaternary structural folds, which consist of multiple protein chains that form chemical bonds among side chains to reach a structurally stable domain. The complexity of modeling the quaternary fold poses major theoretical and computational challenges to current machine learning methods. We propose methods to address these challenges and show how (1) domain knowledge is encoded and used to characterize structural properties through segmentation conditional graphical models; and (2) model complexity is handled through efficient inference algorithms. Our model follows a discriminative approach, so any informative features, such as those representing overlapping or long-range interactions, can be used conveniently. The model is applied to predict two important quaternary folds, the triple β-spirals and the double-barrel trimers. Cross-family validation shows that our method outperforms other state-of-the-art algorithms.