Predicting disulfide connectivity from protein sequence using multiple sequence feature vectors and secondary structure

Authors:
Jiangning Song;Zheng Yuan;Hao Tan;Thomas Huber;Kevin Burrage
Venue:
Bioinformatics
Year:
2007

Abstract

Motivation: Disulfide bonds are primary covalent crosslinks between two cysteine residues in proteins. They play critical roles in stabilizing protein structures and are commonly found in extracytoplasmic or secreted proteins. In protein folding prediction, locating the disulfide bonds can greatly reduce the search in conformational space. There is therefore a great need for computational methods that accurately predict disulfide connectivity patterns in proteins, with potentially important applications.

Results: We have developed a novel method to predict disulfide connectivity patterns from protein primary sequence, using a support vector regression (SVR) approach based on multiple sequence feature vectors and secondary structure predicted by the PSIPRED program. The results indicate that our method achieves a prediction accuracy of 74.4% at the protein level and 77.9% at the cysteine-pair level, averaged over proteins with two to five disulfide bridges using 4-fold cross-validation on a well-defined non-homologous dataset. We assessed the effects of different sequence encoding schemes on the prediction performance of disulfide connectivity. The sequence encoding scheme based on multiple sequence feature vectors coupled with predicted secondary structure significantly improves the prediction accuracy, enabling our method to outperform most other currently available predictors. Our work provides a complementary approach to current algorithms; it should be useful for computationally assigning disulfide connectivity patterns and should help in the annotation of protein sequences generated by large-scale whole-genome projects.

Availability: The prediction web server and Supplementary Material are accessible at http://foo.maths.uq.edu.au/~huber/disulfide

Contact: kb@maths.uq.edu.au

Supplementary information: Supplementary data are available at Bioinformatics online.
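
To make the two accuracy figures in the Results concrete, the following is a minimal Python sketch, not the authors' code. It assumes the usual definitions from the disulfide connectivity literature, which the abstract does not spell out: protein-level accuracy is the fraction of proteins whose whole connectivity pattern is predicted correctly, and cysteine-pair accuracy is the fraction of true disulfide bridges that are recovered. The function names connectivity_patterns and accuracies are hypothetical. The sketch also enumerates the candidate patterns: a protein with n bridges has (2n-1)!! possible patterns, which is 3, 15, 105 and 945 for n = 2 to 5.

```python
from typing import FrozenSet, Iterator, List, Sequence, Tuple

Pair = Tuple[int, int]      # positions of two bonded cysteines
Pattern = FrozenSet[Pair]   # one complete connectivity pattern


def connectivity_patterns(cysteines: Sequence[int]) -> Iterator[Pattern]:
    """Yield every way to pair up an even number of bonded cysteines."""
    if not cysteines:
        yield frozenset()
        return
    first, rest = cysteines[0], list(cysteines[1:])
    # Bond the first cysteine to each possible partner, then recurse.
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for sub in connectivity_patterns(remaining):
            yield sub | {(first, partner)}


def accuracies(true: List[Pattern], pred: List[Pattern]) -> Tuple[float, float]:
    """Return (protein-level, pair-level) accuracy as fractions.

    Assumed definitions (not taken from the abstract): a protein counts
    as correct only if its full pattern matches; a pair counts as
    correct if the true bridge appears in the predicted pattern.
    """
    proteins_right = sum(t == p for t, p in zip(true, pred))
    pairs_right = sum(len(t & p) for t, p in zip(true, pred))
    total_pairs = sum(len(t) for t in true)
    return proteins_right / len(true), pairs_right / total_pairs


if __name__ == "__main__":
    # Toy data: two proteins, one fully correct, one with 1 of 3 bridges right.
    true = [frozenset({(1, 2), (3, 4)}),
            frozenset({(1, 2), (3, 4), (5, 6)})]
    pred = [frozenset({(1, 2), (3, 4)}),
            frozenset({(1, 2), (3, 5), (4, 6)})]
    print(accuracies(true, pred))
    print(sum(1 for _ in connectivity_patterns(range(10))))
```

On the toy data one of two proteins is fully correct and three of five bridges are recovered, which shows why the pair-level figure can exceed the protein-level one: a partly correct pattern still earns credit for each bridge it gets right.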