A novel method for classifying subfamilies and sub-subfamilies of G-protein coupled receptors

  • Authors:
  • Majid Beigi;Andreas Zell

  • Affiliations:
  • Center for Bioinformatics Tübingen (ZBIT), University of Tübingen, Tübingen, Germany;Center for Bioinformatics Tübingen (ZBIT), University of Tübingen, Tübingen, Germany

  • Venue:
  • ISBMDA'06 Proceedings of the 7th international conference on Biological and Medical Data Analysis
  • Year:
  • 2006

Abstract

G-protein coupled receptors (GPCRs) are a large superfamily of integral membrane proteins that transduce signals across the cell membrane. Because of this property and the other physiological roles played by the GPCR family, they are an important target of therapeutic drugs. The function of many GPCRs is unknown, and accurate classification of GPCRs can help predict their function. In this study we propose a kernel-based method to classify them at the subfamily and sub-subfamily levels. At the sub-subfamily level, where we faced a low number of sequences (imbalanced data), we used our new synthetic protein sequence oversampling (SPSO) algorithm to enhance the accuracy and sensitivity of the classifiers. At the subfamily level we obtained an overall accuracy and Matthews correlation coefficient (MCC) of 98.4% and 0.98 for class A, nearly 100% and 1 for class B, and 96.95% and 0.91 for class C, respectively; at the sub-subfamily level we obtained an overall accuracy and MCC of 97.93% and 0.95. The results show that our oversampling technique can be used in other protein classification applications that face the problem of imbalanced data.
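
The abstract reports both overall accuracy and MCC, and the two can diverge sharply on imbalanced data. The following is a minimal sketch of the standard binary definitions of both metrics, not the authors' evaluation code. The function names and the confusion-matrix counts are hypothetical and chosen only for illustration.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal sum is zero, a common convention
    for the otherwise undefined case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical imbalanced case: 10 positives and 90 negatives,
# where the classifier misses half of the positives.
tp, tn, fp, fn = 5, 90, 0, 5
print(f"accuracy = {accuracy(tp, tn, fp, fn):.2f}")
print(f"MCC      = {mcc(tp, tn, fp, fn):.2f}")
```

In this made-up case the accuracy stays high because the majority class dominates, while the MCC is noticeably lower because half of the minority class is misclassified. This is one reason MCC is commonly reported alongside accuracy for imbalanced problems.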