Boosting Methods for Protein Fold Recognition: An Empirical Comparison

  • Authors:
  • Yazhene Krishnaraj; Chandan K. Reddy

  • Venue:
  • BIBM '08 Proceedings of the 2008 IEEE International Conference on Bioinformatics and Biomedicine
  • Year:
  • 2008

Abstract

Protein fold recognition is the prediction of a protein's tertiary structure (fold) from its sequence without relying on sequence similarity. Most state-of-the-art research applying machine learning to protein fold recognition has focused on more traditional algorithms such as Support Vector Machines (SVM), K-Nearest Neighbor (KNN) and Neural Networks (NN). In this paper, we present an empirical study of two variants of Boosting algorithms, AdaBoost and LogitBoost, for the problem of fold recognition. Prediction accuracy is measured on a dataset with proteins from the 27 most populated folds in the SCOP database, and is compared with results reported in the literature for SVM, KNN and NN algorithms on the same dataset. Overall, Boosting methods achieve 60% fold recognition accuracy on an independent test protein dataset, which is the highest accuracy when compared with the values obtained with other methods proposed in the literature. Boosting algorithms also have the potential to build efficient classification models very quickly.