Prediction by support vector machines and analysis by Z-score of poly-l-proline type II conformation based on local sequence

  • Authors:
  • Ming-Lei Wang;Hui Yao;Wen-Bo Xu

  • Affiliations:
  • Laboratory of Bioinformatics, The Key Laboratory of Industrial Biotechnology, Ministry of Education, Southern Yangtze University, Wuxi 214036, China and School of Biotechnology, Southern Yangtze U ...;School of Information Technology, Southern Yangtze University, Wuxi 214036, China;School of Information Technology, Southern Yangtze University, Wuxi 214036, China

  • Venue:
  • Computational Biology and Chemistry
  • Year:
  • 2005

Abstract

In recent years, the poly-l-proline type II (PPII) conformation has attracted growing interest because it plays vital roles in many biological processes. However, few studies have attempted to predict PPII secondary structure computationally. The support vector machine (SVM) represents a new approach to supervised pattern classification and has been applied successfully to a wide range of pattern recognition problems. In this paper, we present an SVM method for predicting PPII conformation from local sequence. The overall accuracy reached approximately 70% on both the independent test set and the jackknife estimate, and the Matthews correlation coefficient (MCC) reached up to 0.4. Comparing results for training and test datasets with different sequence identities suggests that the performance of the method correlates with the sequence identity of the dataset. The parameter of the SVM kernel function was also an important factor in the performance of the method. We further analyzed the propensities of residues located at different positions. By computing Z-scores, we found that P and G were the two most important residues for the PPII conformation.
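
The two evaluation measures named in the abstract, overall accuracy and MCC, can both be computed from the four counts of a binary confusion matrix (PPII vs. non-PPII). The following is a minimal sketch using the standard definitions; the function name `accuracy_and_mcc` and the counts passed to it are hypothetical, chosen only for illustration, and are not data or code from the paper.

```python
import math


def accuracy_and_mcc(tp, tn, fp, fn):
    """Return (accuracy, MCC) for a binary confusion matrix.

    tp/tn/fp/fn are counts of true positives, true negatives,
    false positives and false negatives.
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is taken as 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Hypothetical counts for illustration only (not results from the paper).
acc, mcc = accuracy_and_mcc(tp=70, tn=70, fp=30, fn=30)
print(f"accuracy={acc:.2f}, MCC={mcc:.2f}")  # accuracy=0.70, MCC=0.40
```

Unlike accuracy, MCC uses all four counts, so it stays informative when one class is much more common than the other. This may be why the abstract reports both figures.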