Fold Recognition by Predicted Alignment Accuracy

Authors:
Jinbo Xu
Venue:
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Year:
2005

Abstract

A key step in protein structure prediction by threading is choosing the best overall template for a given target sequence once all the optimal sequence-template alignments have been generated. The chosen template should have the best alignment with the target sequence, because the three-dimensional structure of the target is built from the sequence-template alignment. The traditional method for template selection is the Z-score, which uses a statistical test to rank all the sequence-template alignments and then selects the top-ranked template for the sequence. However, calculating Z-scores is time-consuming and unsuitable for genome-scale structure prediction. Z-scores are also hard to interpret when the threading scoring function is a weighted sum of several energy terms with different physical meanings. This paper presents a Support Vector Machine (SVM) regression approach that directly predicts the alignment accuracy of each sequence-template alignment; the predicted accuracy is then used to rank all the templates for a specific target sequence. Experimental results on a large-scale benchmark demonstrate that SVM regression performs much better than the composition-corrected Z-score method. SVM regression also runs much faster than the Z-score method.
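The contrast between the two ranking strategies can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes that higher scores are better and that the Z-score background comes from re-aligning shuffled copies of the target, which is one common way to build such a statistical test. The names `z_score`, `rank_by_z_score`, `rank_by_predicted_accuracy`, `align_score`, and `predict` are hypothetical, and the trained SVM regressor is abstracted as a plain callable.

```python
import random
import statistics
from typing import Callable, Dict, List, Sequence, Tuple


def z_score(raw_score: float, background_scores: Sequence[float]) -> float:
    """Standardize a raw alignment score against a background distribution."""
    mean = statistics.fmean(background_scores)
    std = statistics.pstdev(background_scores)
    if std == 0.0:
        return 0.0  # degenerate background: no meaningful separation
    return (raw_score - mean) / std


def rank_by_z_score(
    target: str,
    templates: Dict[str, str],
    align_score: Callable[[str, str], float],  # hypothetical scoring function
    n_shuffles: int = 100,
    seed: int = 0,
) -> List[Tuple[str, float]]:
    """Rank templates by Z-score (assumption: background = shuffled targets).

    Each template costs n_shuffles extra alignments, which is why this
    style of ranking becomes expensive at genome scale.
    """
    rng = random.Random(seed)
    ranked = []
    for name, template in templates.items():
        raw = align_score(target, template)
        background = []
        for _ in range(n_shuffles):
            residues = list(target)
            rng.shuffle(residues)
            background.append(align_score("".join(residues), template))
        ranked.append((name, z_score(raw, background)))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


def rank_by_predicted_accuracy(
    features: Dict[str, Sequence[float]],
    predict: Callable[[Sequence[float]], float],  # stand-in for a trained regressor
) -> List[Tuple[str, float]]:
    """Rank templates by predicted alignment accuracy: one prediction each."""
    ranked = [(name, predict(vector)) for name, vector in features.items()]
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


def toy_identity_score(seq_a: str, seq_b: str) -> float:
    """Toy score for illustration only: number of identical aligned positions."""
    return float(sum(a == b for a, b in zip(seq_a, seq_b)))
```

In this sketch the Z-score route needs `n_shuffles` additional alignments per template, whereas the regression route needs a single prediction per alignment once the features are available. That difference is the intuition behind the speed claim in the abstract.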