A SVM for GPCR Protein Prediction Using Pattern Discovery

  • Authors:
  • Francisco Nascimento Junior; Ing Ren Tsang; George D. C. Cavalcanti

  • Venue:
  • HIS '08 Proceedings of the 2008 8th International Conference on Hybrid Intelligent Systems
  • Year:
  • 2008

Abstract

Machine learning techniques have been applied to many problems in bioinformatics. Similarly, pattern discovery algorithms have been used to uncover hidden motifs in protein sequences, contributing greatly to the understanding of protein classification. G-protein coupled receptors (GPCRs) form one of the largest protein families in the human genome. Most of these receptors are major targets for drug discovery and development and are therefore of interest to the pharmaceutical industry. The technique used in this paper combines machine learning and pattern discovery methods to predict the functional class of a protein, specifically the GPCR protein class. Vilo [2] proposed an algorithm that extracts regular-expression patterns from known GPCR protein sequences and used these patterns to predict the coupling specificity of G-protein coupled receptors to their G proteins. We analyze these patterns and combine them as input features for an SVM that predicts the GPCR superclass. We report the results using ROC curves, which are well suited to evaluating the performance of this kind of classifier. The experiments, based on the GPCRDB database, also showed that we were able to find some novel GPCR sequences that were not described in the PROSITE database.
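
The pipeline described in the abstract (regular-expression patterns turned into features, an SVM trained on them, and a ROC-based evaluation) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the three patterns, the toy sequences and their labels, the small linear SVM trained by sub-gradient descent (the abstract does not state which kernel or solver was used), and the use of the area under the ROC curve as a one-number summary of the curve.

```python
import re

# Hypothetical toy patterns, invented for illustration only.
# The patterns actually used in the paper are not listed in the abstract.
PATTERNS = [r"DRY", r"NP..Y", r"C.{2}C"]

def pattern_features(sequence, patterns=PATTERNS):
    """Binary vector: 1 if the pattern occurs anywhere in the sequence, else 0."""
    return [1 if re.search(p, sequence) else 0 for p in patterns]

def decision_value(w, b, x):
    """Signed score of a linear classifier: w . x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=200):
    """Fixed-step sub-gradient descent on the L2-regularised hinge loss.

    Labels must be +1 / -1. This is a stand-in for a real SVM library.
    """
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * decision_value(w, b, xi)
            # Shrink the weights (gradient of the regulariser).
            w = [wj * (1.0 - eta * lam) for wj in w]
            # Move towards the sample only when the hinge loss is active.
            if margin < 1.0:
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
    return w, b

def roc_auc(scores, labels):
    """Area under the ROC curve as the fraction of correctly ordered
    (positive, negative) pairs; ties count as half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == -1]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy sequences: +1 stands for "GPCR", -1 for "not GPCR".
SEQUENCES = [
    ("MALDRYLAICNPLLYK", 1),
    ("MSDRYVAVTKPLW", 1),
    ("MTNPFIYAGGL", 1),
    ("MKCAACGGSL", -1),
    ("MGGSLLKKAE", -1),
]

X = [pattern_features(seq) for seq, _ in SEQUENCES]
y = [label for _, label in SEQUENCES]
w, b = train_linear_svm(X, y)
scores = [decision_value(w, b, x) for x in X]
auc = roc_auc(scores, y)

print("features:", X)
print("training AUC:", auc)
```

The AUC printed here is computed on the same toy sequences used for training, so it only shows how the pieces connect and says nothing about real predictive performance. In practice the feature vectors would be passed to an established SVM implementation and evaluated on held-out sequences.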