HIV-1 CRF01_AE coreceptor usage prediction using kernel methods based logistic model trees

  • Authors:
  • Watshara Shoombuatong;Sayamon Hongjaisee;Francis Barin;Jeerayut Chaijaruwanich;Tanawan Samleerat

  • Affiliations:
  • Department of Computer Science, Chiang Mai University, Thailand and Bioinformatics Research Laboratory, Chiang Mai University, Thailand;Department of Medical Technology, Faculty of Associated Medical Sciences, Chiang Mai University, Thailand;Université François Rabelais, Inserm U966, 37032 Tours Cedex, France;Department of Computer Science, Chiang Mai University, Thailand and Bioinformatics Research Laboratory, Chiang Mai University, Thailand;Department of Medical Technology, Faculty of Associated Medical Sciences, Chiang Mai University, Thailand

  • Venue:
  • Computers in Biology and Medicine
  • Year:
  • 2012

Abstract

Determining HIV-1 coreceptor usage plays a major role in HIV treatment. Because Maraviroc is used to treat patients who exclusively harbor R5-tropic strains, accurate classification of HIV-1 coreceptor usage can help select the most suitable treatment. In general, HIV-1 variants are classified as R5-tropic, X4-tropic, or dual/mixed-tropic according to their coreceptor usage. Coreceptor usage has been classified with various computational methods and genotypic algorithms based on V3 amino acid sequences. Most genotypic tools were designed on data sets of HIV-1 subtype B and appear to be reliable only for that subtype; their performance decreases on non-B subtypes. In this study, the support vector machine (SVM) method is used to classify HIV-1 coreceptor usage. To develop an efficient SVM classifier, we present a feature selector that uses the logistic model tree (LMT) method to select the most relevant positions from the V3 amino acid sequences. Our approach achieves up to 97.8% accuracy, 97.7% specificity, and 97.9% sensitivity, as measured by ten-fold cross-validation on 273 sequences.
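The evaluation protocol named in the abstract (ten-fold cross-validation scored by accuracy, specificity, and sensitivity) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the round-robin fold assignment, the choice of X4 as the positive class, and the toy labels are all assumptions made for illustration, and no SVM or LMT step is included.

```python
# Minimal sketch of the evaluation arithmetic only (hypothetical helper names).
# Assumption: "X4" is treated as the positive class; the abstract does not say
# which class is positive.

def kfold_indices(n_samples, k=10):
    """Assign sample indices to k folds in round-robin order (an assumed scheme)."""
    return [list(range(fold, n_samples, k)) for fold in range(k)]


def confusion_counts(y_true, y_pred, positive="X4"):
    """Count true/false positives and negatives for a two-class problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth == positive:
            fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def summarize(y_true, y_pred, positive="X4"):
    """Return accuracy, sensitivity, and specificity as fractions in [0, 1]."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else float("nan"),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Ten folds over 273 sequences, the data set size stated in the abstract.
folds = kfold_indices(273, k=10)

# Toy labels, invented purely to exercise the functions.
toy_true = ["X4", "X4", "R5", "R5", "R5"]
toy_pred = ["X4", "R5", "R5", "R5", "X4"]
toy_metrics = summarize(toy_true, toy_pred)
```

In a full pipeline, each fold would be held out in turn, a classifier trained on the remaining nine folds, and the held-out predictions pooled before calling `summarize`.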