Selecting different protein representations and classification algorithms in hierarchical protein function prediction

  • Authors:
  • Carlos N. Silla, Jr.; Alex A. Freitas

  • Affiliations:
  • School of Computing and Centre for Biomedical Informatics, University of Kent, Canterbury, UK; School of Computing and Centre for Biomedical Informatics, University of Kent, Canterbury, UK

  • Venue:
  • Intelligent Data Analysis
  • Year:
  • 2011

Abstract

Automatically inferring the function of unknown proteins is a challenging task in proteomics. Computational protein function prediction poses two major problems: the choice of the protein representation and the choice of the classification algorithm. There are several ways of extracting features from a protein, and the choice of feature representation might be as important as the choice of classification algorithm. These problems are aggravated in hierarchical protein function prediction, where a hierarchy of classifiers is built and the construction of each classifier has to address both selection problems. In this paper we address these problems by employing three alternative selective hierarchical classification approaches: (a) selecting the best classifier given a fixed representation; (b) selecting the best representation given a fixed classifier; and (c) selecting the best classifier and representation simultaneously, in a synergistic fashion. The analysis of the results shows that the selective representation approach is almost always ranked first when compared against the different fixed representations, and that the selective classifier approach is not able to surpass using only the best classifier for the target problem.
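
As an illustration of the selection idea only, the following Python sketch picks a representation and classifier pair for a single node of a class hierarchy. It is not the authors' implementation: the abstract does not state the selection criterion, so accuracy on a held-out validation set is assumed here, and the two toy representations (`length`, `hydrophobic_fraction`), the two toy classifiers, the function `select_for_node` and the example sequences are all hypothetical.

```python
from collections import Counter
from itertools import product


def majority_classifier(X, y):
    """Baseline: always predict the most common training label (ignores X)."""
    label = Counter(y).most_common(1)[0][0]
    return lambda x: label


def nearest_centroid_classifier(X, y):
    """Predict the class whose mean feature vector is closest to the input."""
    sums, counts = {}, Counter(y)
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
    centroids = {c: [v / counts[c] for v in s] for c, s in sums.items()}

    def predict(x):
        def squared_distance(c):
            return sum((a - b) ** 2 for a, b in zip(x, centroids[c]))
        # sorted() makes tie-breaking deterministic (alphabetical by class).
        return min(sorted(centroids), key=squared_distance)

    return predict


def accuracy(predict, X, y):
    return sum(predict(x) == t for x, t in zip(X, y)) / len(y)


def select_for_node(train, valid, representations, classifiers):
    """Return (validation accuracy, representation name, classifier name).

    Passing one representation gives approach (a), passing one classifier
    gives approach (b), and passing several of each gives approach (c).
    Validation accuracy as the criterion is an assumption of this sketch.
    """
    best = None
    for (r_name, rep), (c_name, fit) in product(
        representations.items(), classifiers.items()
    ):
        model = fit([rep(s) for s, _ in train], [t for _, t in train])
        score = accuracy(model, [rep(s) for s, _ in valid], [t for _, t in valid])
        if best is None or score > best[0]:
            best = (score, r_name, c_name)
    return best


# Hypothetical toy representations of an amino-acid sequence.
REPRESENTATIONS = {
    "length": lambda s: [float(len(s))],
    "hydrophobic_fraction": lambda s: [sum(ch in "AVLIMFWC" for ch in s) / len(s)],
}
CLASSIFIERS = {
    "majority": majority_classifier,
    "nearest_centroid": nearest_centroid_classifier,
}

# Hypothetical toy data for one node: (sequence, class label) pairs.
train = [("AVLIAV", "membrane"), ("KDEKDE", "soluble"),
         ("LLIVAK", "membrane"), ("DEKKSA", "soluble")]
valid = [("VVLLIA", "membrane"), ("KKDDEE", "soluble")]

print(select_for_node(train, valid, REPRESENTATIONS, CLASSIFIERS))
# prints (1.0, 'hydrophobic_fraction', 'nearest_centroid')
```

In a hierarchical setting, a function like this would presumably be called once per classifier in the hierarchy, so that different nodes can end up with different representations or algorithms. How the paper organizes that loop is not specified in the abstract.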