A One-Class Classification Approach for Protein Sequences and Structures

  • Authors:
  • András Bánhalmi;Róbert Busa-Fekete;Balázs Kégl

  • Affiliations:
  • Research Group on Artificial Intelligence of the Hungarian Academy of Sciences and University of Szeged, Szeged, Hungary H-6720;Research Group on Artificial Intelligence of the Hungarian Academy of Sciences and University of Szeged, Szeged, Hungary H-6720 and LAL, University of Paris-Sud, CNRS, Orsay, France 91898;LAL, University of Paris-Sud, CNRS, Orsay, France 91898

  • Venue:
  • ISBRA '09 Proceedings of the 5th International Symposium on Bioinformatics Research and Applications
  • Year:
  • 2009

Abstract

The One-Class Classification (OCC) approach assumes that only samples from a target class are available in the training phase. OCC methods have been applied successfully to problems where the classes differ greatly in size. Because class imbalance is typical of protein classification tasks, we tested one-class classification algorithms for detecting distant similarities in protein sequences and structures. We found that the OCC approach yielded a small improvement in classification performance over binary classifiers (SVM, ANN, Random Forest). More importantly, training time improved substantially (50- to 100-fold). OCCs may be an especially useful alternative for protein groups where discriminative classifiers cannot be easily trained.
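
To make the one-class setting concrete, the following is a minimal sketch of a classifier fitted on target-class samples only. It is not the method evaluated in the paper, since the abstract does not name the OCC algorithms used; it is a simple centroid-distance model chosen for illustration. The function names (`fit_one_class`, `is_target`), the 2-D synthetic feature vectors, and the 0.95 quantile are all assumptions.

```python
import math
import random


def fit_one_class(targets, quantile=0.95):
    """Fit a centroid and a distance threshold from target-class samples only.

    Illustrative model, not the paper's: a sample is accepted if it lies
    no farther from the centroid than most of the training targets do.
    """
    dim = len(targets[0])
    centroid = [sum(x[i] for x in targets) / len(targets) for i in range(dim)]
    dists = sorted(math.dist(x, centroid) for x in targets)
    # Threshold: the distance below which roughly `quantile` of the targets fall.
    idx = min(len(dists) - 1, int(quantile * len(dists)))
    return centroid, dists[idx]


def is_target(x, model):
    """Accept x as a member of the target class if it is within the threshold."""
    centroid, threshold = model
    return math.dist(x, centroid) <= threshold


random.seed(0)
# Hypothetical feature vectors for the target class (e.g. one protein group).
# No negative (outlier) samples are used during fitting.
targets = [(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)) for _ in range(200)]
model = fit_one_class(targets)

near = is_target((0.0, 0.0), model)    # a point close to the target cloud
far = is_target((10.0, 10.0), model)   # a point far from the target cloud
```

Because fitting touches only the target class, its cost does not grow with the size of the (typically much larger) non-target class, which is in line with the shorter training times the abstract reports for OCC methods.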