Improving the protein fold recognition accuracy of a reduced state-space hidden Markov model

Authors:
Christos Lampros; Costas Papaloukas; Kostas Exarchos; Dimitrios I. Fotiadis; Dimitrios Tsalikakis
Affiliations:
Unit of Medical Technology and Intelligent Information Systems, Department of Computer Science, University of Ioannina, PO Box 1186, GR 45110 Ioannina, Greece and Department of Medical Physics, Me ...;Unit of Medical Technology and Intelligent Information Systems, Department of Computer Science, University of Ioannina, PO Box 1186, GR 45110 Ioannina, Greece and Department of Biological Applicat ...;Unit of Medical Technology and Intelligent Information Systems, Department of Computer Science, University of Ioannina, PO Box 1186, GR 45110 Ioannina, Greece and Department of Medical Physics, Me ...;Unit of Medical Technology and Intelligent Information Systems, Department of Computer Science, University of Ioannina, PO Box 1186, GR 45110 Ioannina, Greece and Biomedical Research Institute - F ...;Unit of Medical Technology and Intelligent Information Systems, Department of Computer Science, University of Ioannina, PO Box 1186, GR 45110 Ioannina, Greece and Engineering Informatics and Telec ...
Venue:
Computers in Biology and Medicine
Year:
2009

Abstract

Fold recognition is a challenging field strongly associated with protein function determination, which is crucial for biologists and the pharmaceutical industry. Hidden Markov models (HMMs) have been widely used for this purpose. In this paper we demonstrate how the fold recognition performance of a recently introduced HMM with a reduced state-space topology can be improved. Our method employs an efficient architecture and a low-complexity training algorithm based on likelihood maximization. The fold recognition performance of the model is improved in two steps. In the first step, we use a smaller model architecture based on the {E,H,L} alphabet instead of the DSSP secondary structure alphabet. In the second step, secondary structure information (predicted or true) is additionally used in scoring the test set sequences. The Protein Data Bank and the annotation of the SCOP database are used for training and evaluating the proposed methodology. The results show that fold recognition accuracy improves substantially in both steps. Specifically, accuracy increases by 2.9% to 22% in the first step. In the second step, it reaches up to 30% when predicted secondary structure information is additionally used, and up to 34.7% when the true secondary structure is used. The major advantage of the proposed improvements is that fold recognition performance increases substantially while both the size of the model and the computational complexity of scoring decrease.
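
The move from the DSSP alphabet to the {E,H,L} alphabet described in the abstract can be illustrated with a short example. This is only a minimal sketch: the abstract does not state which DSSP states are merged into which class, so the mapping below (helix-like states H, G, I to H; strand-like states E, B to E; everything else to L) is an assumed, commonly used convention, and the names `DSSP_TO_EHL` and `reduce_dssp` are hypothetical.

```python
# ASSUMPTION: a widely used 8-to-3 reduction of DSSP states. The abstract
# does not give the paper's mapping, so this is illustrative only.
DSSP_TO_EHL = {
    "H": "H", "G": "H", "I": "H",            # helix-like states -> H
    "E": "E", "B": "E",                      # strand-like states -> E
    "T": "L", "S": "L", "-": "L", " ": "L",  # turns, bends, coil -> L
}


def reduce_dssp(ss: str) -> str:
    """Map a DSSP secondary structure string onto the {E,H,L} alphabet.

    Any symbol not listed in DSSP_TO_EHL is treated as loop (L).
    """
    return "".join(DSSP_TO_EHL.get(state, "L") for state in ss)


if __name__ == "__main__":
    dssp = "-HHHHGGGTT-EEEEB-SS"
    print(reduce_dssp(dssp))  # prints LHHHHHHHLLLEEEEELLL
```

With fewer secondary structure symbols there are fewer distinct states to model, which is consistent with the abstract's remark that the model becomes smaller in the first step.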