Prediction of Protein Secondary Structure with two-stage multi-class SVMs

Authors:
Minh N. Nguyen;Jagath C. Rajapakse
Affiliations:
BioInformatics Research Centre, School of Computer Engineering, Nanyang Technological University, Singapore; Biological Engineering Division, Massachusetts Institute of Technology, USA / Singapore-MIT Alliance, N2-B2C-15, 50 Nanyang Avenue, Singapore 639798
Venue:
International Journal of Data Mining and Bioinformatics
Year:
2007

Abstract

Bioinformatics techniques for Protein Secondary Structure (PSS) prediction mostly depend on the information available in amino acid sequences. In this paper, we propose a two-stage Multi-class Support Vector Machine (MSVM) approach, in which a second MSVM predictor is introduced at the output of the first-stage MSVM to capture the contextual relationships among secondary structure elements and so minimise the generalisation error of the prediction. Using position-specific scoring matrices generated by PSI-BLAST, the two-stage MSVM approach achieves Q3 accuracies of 78.0% and 76.3% on the RS126 dataset of 126 non-homologous globular proteins and the CB396 dataset of 396 non-homologous proteins, respectively, which are better than the scores reported on both datasets to date. By using MSVM, the present prediction scheme achieves significant improvements of 2-6% and 3-15% in Q3 and Sov accuracies, respectively, on the two datasets. On larger blind-test datasets from PSIPRED, CASP4 and EVA, the two-stage MSVM approach achieves Q3 accuracies from 77.0% to 79.5%.
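As a rough illustration of the data flow the abstract describes, the sketch below shows one way the first-stage outputs could be turned into inputs for a second-stage predictor, along with the standard three-state Q3 score (the percentage of residues whose helix, strand or coil label is predicted correctly). This is not the authors' implementation: the function names, the window half-width and the zero-padding at sequence ends are all assumptions made for the example, and the classifiers themselves are left out.

```python
from typing import List, Sequence

STATES = ("H", "E", "C")  # helix, strand, coil


def second_stage_inputs(
    scores: Sequence[Sequence[float]], half_width: int = 2
) -> List[List[float]]:
    """Build one feature vector per residue from first-stage scores.

    scores[i] holds the first-stage outputs for residue i, one value per
    state in STATES. The window size and the zero-padding used for
    positions outside the sequence are assumptions, not from the paper.
    """
    padding = [0.0] * len(STATES)
    features: List[List[float]] = []
    for i in range(len(scores)):
        row: List[float] = []
        # Concatenate the scores of the neighbouring residues around i.
        for j in range(i - half_width, i + half_width + 1):
            row.extend(scores[j] if 0 <= j < len(scores) else padding)
        features.append(row)
    return features


def q3_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose three-state label is correct."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


if __name__ == "__main__":
    # Hypothetical first-stage scores for a three-residue sequence.
    demo_scores = [[0.9, 0.0, 0.1], [0.2, 0.7, 0.1], [0.1, 0.1, 0.8]]
    demo_features = second_stage_inputs(demo_scores, half_width=1)
    print(len(demo_features[0]))        # 3 positions x 3 states
    print(q3_accuracy("HHEC", "HHCC"))  # 3 of 4 residues match
```

Each second-stage feature vector lets a classifier see the first-stage predictions of neighbouring residues, which is one plausible way to capture the contextual relationship among secondary structure elements that the abstract refers to.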