Ensemble classifier for protein fold pattern recognition

  • Authors: Hong-Bin Shen; Kuo-Chen Chou
  • Affiliations: Institute of Image Processing and Pattern Recognition, Shanghai Jiaotong University, Shanghai 200030, China (both authors)
  • Venue: Bioinformatics
  • Year: 2006

Quantified Score

Hi-index 3.84

Abstract

Motivation: Prediction of protein folding patterns is one level deeper than that of protein structural classes, and hence is much more complicated and difficult. To deal with this challenging problem, an ensemble classifier was introduced. It was formed from a set of basic classifiers, each trained on a different parameter system, such as predicted secondary structure, hydrophobicity, van der Waals volume, polarity, polarizability, and different dimensions of pseudo-amino acid composition, all extracted from a training dataset. The operation engine for the constituent individual classifiers was the OET-KNN (optimized evidence-theoretic k-nearest neighbors) rule. Their outcomes were combined through weighted voting to give a final determination for classifying a query protein. The recognition task was to find the true fold among the 27 possible patterns.

Results: The overall success rate thus obtained was 62% for a testing dataset where most of the proteins have …

Availability: The ensemble classifier, called PFP-Pred, is available as a web-server at http://202.120.37.186/bioinf/fold/PFP-Pred.htm for public usage.

Contact: lifesci-sjtu@san.rr.com

Supplementary information: Supplementary data are available on Bioinformatics online.
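The weighted-voting step described under Motivation can be illustrated with a minimal Python sketch. It is not the PFP-Pred implementation: the abstract does not say how the weights are chosen or how ties are resolved, so the function name `weighted_vote`, the example weights, and the tie-breaking rule are all assumptions made for illustration. The base classifiers themselves (OET-KNN in the paper) are not modelled; only their predicted fold labels are taken as input.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Combine one predicted fold label per base classifier into a final label.

    predictions: fold labels (e.g. ints in 1..27), one per base classifier.
    weights: non-negative floats, one per base classifier (hypothetical values).
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per prediction")
    if not predictions:
        raise ValueError("need at least one prediction")

    # Accumulate the total weight cast for each candidate fold.
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight

    # Pick the fold with the largest total weight.
    # Assumption: ties go to the smallest label, an arbitrary choice.
    return min(scores, key=lambda label: (-scores[label], label))


# Hypothetical outputs of five base classifiers, each imagined as trained
# on a different feature set (secondary structure, hydrophobicity, ...).
votes = [3, 7, 3, 12, 7]
weights = [0.30, 0.25, 0.15, 0.20, 0.10]
final_fold = weighted_vote(votes, weights)
print(final_fold)
```

With these made-up weights, fold 3 collects a total of 0.45 against 0.35 for fold 7 and 0.20 for fold 12, so the ensemble reports fold 3 even though folds 3 and 7 each received two votes.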