Identifying significant features in HIV sequence to predict patients' response to therapies

Authors:
Samuel Evangelista de Lima Oliveira;Luiz Henrique de Campos Merschmann;Leoneide Erica Maduro Bouillet
Affiliations:
Federal University of Ouro Preto, Ouro Preto/MG - Brazil;Federal University of Ouro Preto, Ouro Preto/MG - Brazil;Federal University of Ouro Preto, Ouro Preto/MG - Brazil
Venue:
BSB'11 Proceedings of the 6th Brazilian conference on Advances in bioinformatics and computational biology
Year:
2011

Abstract

The Human Immunodeficiency Virus (HIV) is a retrovirus that attacks the human immune system, reducing its effectiveness. Combinations of antiretroviral drugs are used to treat HIV infection. However, the high mutation rate of HIV makes it resistant to some antiretroviral drugs and leads to treatment failure. Computational methods based on machine learning now attempt to predict patients' response to therapies. In this bioinformatics study, we apply data preprocessing techniques to find significant features in HIV sequences that may be useful for predicting patients' short-term progression. Experiments were conducted with four classification methods using datasets composed of different sets of attributes. Classifiers trained on a dataset including solely viral load, CD4+ cell counts and information about mutations in the viral genome achieved accuracies ranging from 50.29% to 63.87%. Adding further attributes (antiretroviral drug resistance levels, HIV subtype, epitope occurrence and others) to the dataset improved the accuracy of the classifiers in almost all tests executed in this work, indicating their relevance to the prediction task discussed here.
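
The comparison described in the abstract, training the same classifier on a baseline attribute set and on an extended one and then comparing accuracies, can be sketched roughly as follows. This is only an illustration under several assumptions: the attribute names (`viral_load`, `cd4_count`, `resistance_level`) are hypothetical, the baseline set is simplified (mutation information is left out for brevity), the values are invented toy numbers already scaled to [0, 1], and a nearest-centroid classifier stands in for the four classification methods, which the abstract does not name. The accuracies it prints say nothing about the study's actual results.

```python
from statistics import mean

# Hypothetical attribute names; the real study's attributes are richer.
BASELINE = ["viral_load", "cd4_count"]
EXTENDED = BASELINE + ["resistance_level"]


def make_row(viral_load, cd4_count, resistance_level, responder):
    """Build one toy patient record (all values invented, scaled to [0, 1])."""
    return {
        "viral_load": viral_load,
        "cd4_count": cd4_count,
        "resistance_level": resistance_level,
        "responder": responder,
    }


# Invented training and test records, for illustration only.
TRAIN = [
    make_row(0.2, 0.6, 0.1, 1),
    make_row(0.4, 0.5, 0.0, 1),
    make_row(0.3, 0.7, 0.2, 1),
    make_row(0.7, 0.5, 0.9, 0),
    make_row(0.6, 0.4, 0.8, 0),
    make_row(0.8, 0.3, 1.0, 0),
]
TEST = [
    make_row(0.3, 0.6, 0.1, 1),
    make_row(0.6, 0.4, 0.2, 1),
    make_row(0.7, 0.4, 0.9, 0),
    make_row(0.4, 0.6, 0.8, 0),
]


def fit_centroids(rows, features):
    """Return {label: mean feature vector} over the chosen features."""
    centroids = {}
    for label in {r["responder"] for r in rows}:
        members = [r for r in rows if r["responder"] == label]
        centroids[label] = [mean(r[f] for r in members) for f in features]
    return centroids


def predict(centroids, row, features):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((row[f] - c) ** 2 for f, c in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(train, test, features):
    """Fraction of test rows classified correctly using only `features`."""
    centroids = fit_centroids(train, features)
    hits = sum(predict(centroids, r, features) == r["responder"] for r in test)
    return hits / len(test)


FEATURE_SETS = {"baseline": BASELINE, "extended": EXTENDED}
RESULTS = {name: accuracy(TRAIN, TEST, feats) for name, feats in FEATURE_SETS.items()}

for name, acc in RESULTS.items():
    print(f"{name}: {acc:.2f}")
```

Keeping the classifier and the train/test split fixed while varying only the attribute list means any change in accuracy can be attributed to the added attributes, which is the kind of comparison the abstract reports.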