Protein-protein interaction site prediction based on conditional random fields

  • Authors:
  • Ming-Hui Li; Lei Lin; Xiao-Long Wang; Tao Liu

  • Affiliations:
  • Bioinformatics Research Group, ITNLP Lab, Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, China (all authors)

  • Venue:
  • Bioinformatics
  • Year:
  • 2007

Quantified Score

Hi-index 3.84

Abstract

Motivation: The fast-growing number of protein structures in the Protein Data Bank provides the information needed to predict protein-protein interaction sites, which motivates us to develop methods for identifying the residues that participate in protein-protein interactions. We compare a conditional random fields (CRFs)-based method with conventional classification-based methods, which ignore the relation between the labels of neighboring residues, to show the advantages of the CRFs-based method in predicting protein-protein interaction sites.

Results: The prediction of protein-protein interaction sites is treated as a sequential labeling problem and solved by applying CRFs with features that include the protein sequence profile and the residue accessible surface area. The CRFs-based method achieves performance comparable to state-of-the-art methods when 1276 nonredundant hetero-complex protein chains are used as the training and test set. The experimental results show that the CRFs-based method is a powerful and robust protein-protein interaction site prediction method that can guide biologists in designing specific experiments on proteins.

Availability: http://www.insun.hit.edu.cn/~mhli/site_CRFs/index.html

Contact: mhli@insun.hit.edu.cn

Supplementary information: Supplementary data are available at Bioinformatics online.
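
Illustrative sketch (not the authors' implementation): the abstract contrasts labeling each residue independently with sequential labeling, where the labels of neighboring residues influence each other. The toy Python example below shows that difference with a two-label linear-chain model decoded by the Viterbi algorithm. The label names, the two per-residue features (relative accessible surface area and a conservation value standing in for the sequence profile), and all weights are assumptions chosen for illustration; in a real CRF the weights would be learned from training data.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical label set: N = non-interface residue, I = interface residue.
LABELS: Tuple[str, str] = ("N", "I")

# Made-up transition scores that reward neighboring residues sharing a label.
# A trained CRF would learn these values; they are not taken from the paper.
TRANSITION: Dict[Tuple[str, str], float] = {
    ("N", "N"): 0.5, ("N", "I"): -0.5,
    ("I", "N"): -0.5, ("I", "I"): 0.5,
}


def emission_scores(rasa: float, conservation: float) -> Dict[str, float]:
    """Toy per-residue scores from two illustrative features (weights are invented)."""
    return {"N": 0.0, "I": 2.0 * rasa + 1.0 * conservation - 1.5}


def independent_labels(emissions: Sequence[Dict[str, float]]) -> List[str]:
    """Label each residue on its own, ignoring its neighbors."""
    return [max(LABELS, key=lambda lab: scores[lab]) for scores in emissions]


def viterbi(emissions: Sequence[Dict[str, float]]) -> List[str]:
    """Return the highest-scoring label sequence under emission + transition scores."""
    if not emissions:
        return []
    best = [dict(emissions[0])]          # best[t][label] = best score ending in label at t
    back: List[Dict[str, str]] = []      # back[t-1][label] = best previous label
    for scores in emissions[1:]:
        column: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in LABELS:
            prev = max(LABELS, key=lambda p: best[-1][p] + TRANSITION[(p, cur)])
            column[cur] = best[-1][prev] + TRANSITION[(prev, cur)] + scores[cur]
            pointers[cur] = prev
        best.append(column)
        back.append(pointers)
    # Trace the best path backwards from the best final label.
    path = [max(LABELS, key=lambda lab: best[-1][lab])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Three residues given as (relative accessible surface area, conservation); values are invented.
toy_chain = [emission_scores(0.8, 0.9), emission_scores(0.5, 0.4), emission_scores(0.7, 0.8)]
per_residue = independent_labels(toy_chain)
sequential = viterbi(toy_chain)
```

On this toy chain the middle residue has a slightly negative interface score, so the independent classifier labels it as non-interface, while the chain model labels all three residues as interface because its two neighbors score strongly as interface. This is the kind of neighbor-label dependency that a per-residue classifier cannot express.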