Improving protein secondary structure prediction using a multi-modal BP method

  • Authors:
  • Wu Qu; Haifeng Sui; Bingru Yang; Wenbin Qian

  • Affiliations:
  • School of Information Engineering, University of Science and Technology Beijing, Beijing, China (all four authors)

  • Venue:
  • Computers in Biology and Medicine
  • Year:
  • 2011

Abstract

Methods for predicting protein secondary structure provide information that is useful both in ab initio structure prediction and as additional restraints for fold recognition algorithms. Secondary structure predictions may also be used to guide the design of site-directed mutagenesis studies and to locate potentially functionally important residues. In this article, we propose a multi-modal back propagation neural network (MMBP) method for predicting protein secondary structures. Using a method based on the Knowledge Discovery Theory based on Inner Cognitive Mechanism (KDTICM), we have constructed a compound pyramid model (CPM), which is composed of three layers of intelligent interfaces that integrate the MMBP, a mixed-modal SVM (MMS), a modified Knowledge Discovery in Databases (KDD*) process, and other components. CPM is available both as an integrated web server and as a standalone application, and it exploits recent advances in knowledge discovery and machine learning to perform very accurate protein secondary structure predictions. On RCASP256, a non-redundant test dataset of 256 proteins, the CPM method achieves an average Q3 score of 86.13% (SOV99 = 84.66%). Extensive testing indicates that this is significantly better than any other method currently available. Assessments using the RS126 and CB513 datasets indicate that the CPM method can achieve average Q3 scores approaching 83.99% (SOV99 = 80.25%) and 85.58% (SOV99 = 81.15%), respectively. By using both sequence and structure databases and by exploiting the latest techniques in machine learning, it is possible to routinely predict protein secondary structure with an accuracy well above 80%. A program and web server, called CPM, which performs these secondary structure predictions, is accessible at http://kdd.ustb.edu.cn/protein_Web/.
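
For readers unfamiliar with the accuracy measure quoted above, the three-state score is conventionally the percentage of residues whose helix (H), strand (E), or coil (C) label is predicted correctly. The following is a minimal sketch of that conventional definition, not the authors' evaluation code. The function names `q3` and `mean_q3`, the toy sequences, and the choice to average per protein rather than per residue are all assumptions made for illustration; the abstract does not state how the average was taken.

```python
from typing import Iterable, Tuple


def q3(predicted: str, observed: str) -> float:
    """Percentage of residues whose three-state label (H, E, C) matches.

    Both arguments are per-residue label strings of equal length.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


def mean_q3(pairs: Iterable[Tuple[str, str]]) -> float:
    """Average of per-protein scores (assumed averaging scheme)."""
    scores = [q3(pred, obs) for pred, obs in pairs]
    if not scores:
        raise ValueError("at least one protein is required")
    return sum(scores) / len(scores)


# Toy example (hypothetical data): one of ten residues is mispredicted.
observed = "CCHHHHEEEC"
predicted = "CCHHHCEEEC"
print(q3(predicted, observed))  # 90.0
```

The segment overlap measure (SOV99) also reported in the abstract scores agreement at the level of whole secondary structure segments rather than individual residues, so it is not captured by this sketch.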