Biopolymer sequence comparison to identify evolutionarily related proteins, or homologs, is one of the most common tasks in bioinformatics. Support vector machines (SVMs) represent a new approach to the problem in which statistical learning theory is employed to classify proteins into families, thus identifying homologous relationships. Current SVM approaches have been shown to outperform iterative profile methods, such as PSI-BLAST, for protein homology classification. In this study, we demonstrate that using a Bayesian alignment score, which accounts for the uncertainty of all possible alignments, to construct the SVM improves sensitivity relative to the traditional dynamic programming implementation on a benchmark dataset of 54 unique protein families. The SVM-BALSA algorithm returns a higher area under the receiver operating characteristic (ROC) curve for 37 of the 54 families and achieves an improved overall performance curve at a significance level of 0.07.
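The per-family comparison reported above rests on the area under the ROC curve. The following is a minimal sketch of how such a comparison could be computed, not the authors' evaluation code. The function names `roc_auc` and `count_wins` are hypothetical, the scores are assumed to be classifier outputs in which a higher value means "more likely a family member", and the significance test behind the 0.07 level is not specified in the text, so it is not reproduced here.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    scores: classifier outputs, higher means more likely a family member.
    labels: 1 for family members, 0 for non-members.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


def count_wins(auc_a, auc_b):
    """Count families for which method A has a strictly higher AUC than method B."""
    return sum(1 for a, b in zip(auc_a, auc_b) if a > b)
```

Given one AUC per family for each method, `count_wins` yields the kind of tally reported in the abstract (37 of 54 families). The pairwise loop is quadratic in the number of examples, which is acceptable for a sketch but would normally be replaced by a rank-based computation on large benchmarks.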