Protein cellular localization prediction with Support Vector Machines and Decision Trees

  • Authors:
  • Ana Carolina Lorena; André C. P. L. F. de Carvalho

  • Affiliations:
  • Instituto de Ciências Matemáticas e de Computação (ICMC), Universidade de São Paulo (USP), CEP 13560-970, Cx. Postal 668, São Carlos, SP, Brazil (both authors)

  • Venue:
  • Computers in Biology and Medicine
  • Year:
  • 2007

Abstract

Many cellular functions are carried out in specific compartments of the cell, so predicting where a protein localizes helps to identify its function. This paper applies two Machine Learning techniques, Support Vector Machines (SVMs) and Decision Trees, to predict the localization of proteins from three categories of organisms: Gram-positive bacteria, Gram-negative bacteria, and fungi. For every category, localization is a multiclass task in which each class corresponds to a possible protein location. Because SVMs were originally designed for two-class problems, the paper also investigates and compares several strategies for extending them to multiclass prediction.
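
The abstract does not say which multiclass strategies are compared, so the following is only an illustrative sketch of two widely used decompositions, one-vs-all and one-vs-one, that turn any two-class learner into a multiclass predictor. Everything in it is an assumption made for illustration: the `CentroidBinary` class is a simple nearest-centroid stand-in for a binary SVM (it is not an SVM), the function names are hypothetical, and the two-dimensional points and location labels are toy data rather than anything from the paper.

```python
from collections import Counter
from itertools import combinations


class CentroidBinary:
    """Stand-in two-class learner (NOT an SVM): compares distances to class centroids."""

    def fit(self, X, y):
        # y holds +1 / -1 labels; store the mean point of each side.
        pos = [x for x, t in zip(X, y) if t == 1]
        neg = [x for x, t in zip(X, y) if t == -1]
        self.c_pos = [sum(col) / len(pos) for col in zip(*pos)]
        self.c_neg = [sum(col) / len(neg) for col in zip(*neg)]
        return self

    def decision(self, x):
        # Positive score when x is closer to the positive centroid.
        d_pos = sum((a - b) ** 2 for a, b in zip(x, self.c_pos))
        d_neg = sum((a - b) ** 2 for a, b in zip(x, self.c_neg))
        return d_neg - d_pos


def train_one_vs_all(X, y, make_binary=CentroidBinary):
    # One binary model per class: that class (+1) against all others (-1).
    models = {}
    for c in sorted(set(y)):
        targets = [1 if t == c else -1 for t in y]
        models[c] = make_binary().fit(X, targets)
    return models


def predict_one_vs_all(models, x):
    # Pick the class whose binary model returns the largest score.
    return max(models, key=lambda c: models[c].decision(x))


def train_one_vs_one(X, y, make_binary=CentroidBinary):
    # One binary model per pair of classes, trained only on those two classes.
    models = {}
    for a, b in combinations(sorted(set(y)), 2):
        pairs = [(x, 1 if t == a else -1) for x, t in zip(X, y) if t in (a, b)]
        models[(a, b)] = make_binary().fit([x for x, _ in pairs],
                                           [t for _, t in pairs])
    return models


def predict_one_vs_one(models, x):
    # Each pairwise model casts one vote; the most voted class wins.
    votes = Counter()
    for (a, b), model in models.items():
        votes[a if model.decision(x) > 0 else b] += 1
    return votes.most_common(1)[0][0]


# Toy data (illustrative only): three made-up location classes in 2-D.
X = [[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.8], [0.0, 5.0], [0.3, 5.2]]
y = ["cytoplasm", "cytoplasm", "membrane", "membrane",
     "extracellular", "extracellular"]

ova = train_one_vs_all(X, y)
ovo = train_one_vs_one(X, y)
print(predict_one_vs_all(ova, [0.1, 0.1]), predict_one_vs_one(ovo, [4.9, 5.1]))
```

With k classes, one-vs-all trains k binary models on the full data set, while one-vs-one trains k(k-1)/2 models on smaller two-class subsets. Replacing `CentroidBinary` with a real binary SVM would leave the decomposition logic unchanged, provided the replacement exposes the same `fit` and `decision` methods assumed here.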