Segmentation conditional random fields (SCRFs): a new approach for protein fold recognition

  • Authors:
  • Yan Liu; Jaime Carbonell; Peter Weigele; Vanathi Gopalakrishnan

  • Affiliations:
  • School of Computer Science, Carnegie Mellon University, Pittsburgh, PA; School of Computer Science, Carnegie Mellon University, Pittsburgh, PA; Biology Department, Massachusetts Institute of Technology, Cambridge, MA; Center for Biomedical Informatics, University of Pittsburgh, PA

  • Venue:
  • RECOMB'05: Proceedings of the 9th Annual International Conference on Research in Computational Molecular Biology
  • Year:
  • 2005

Abstract

Protein fold recognition is an important step towards understanding protein three-dimensional structures and their functions. We propose a conditional graphical model, segmentation conditional random fields (SCRFs), to solve this problem. In contrast to traditional graphical models such as hidden Markov models (HMMs), SCRFs follow a discriminative approach. They have the flexibility to include overlapping or long-range interaction features over the whole sequence, and their parameters admit globally optimal solutions. In addition, the segmentation setting in SCRFs makes their graphical structures intuitively similar to protein 3-D structures and, more importantly, provides a framework for modeling long-range interactions directly. We apply the model to predict the parallel β-helix fold, an important fold in bacterial infection of plants and binding of antigens. Cross-family validation shows that SCRFs not only score all known β-helices higher than non-β-helices in the Protein Data Bank, but also locate each rung in the known β-helix proteins more successfully than BetaWrap, a state-of-the-art algorithm for predicting the β-helix fold, and HMMER, a general motif detection algorithm based on HMMs. Applying our prediction model to the UniProt database, we hypothesize previously unknown β-helices.