Using Machine Learning to Predict the Health of HIV-Infected Patients

  • Authors:
  • Charles L. Cole; Brian R. King

  • Affiliations:
  • Bucknell University, Lewisburg, PA 17837; Bucknell University, Lewisburg, PA 17837

  • Venue:
  • Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics
  • Year:
  • 2013

Abstract

Human immunodeficiency virus type 1 (HIV-1) is a complex retrovirus that gradually destroys the body's immune system, making it harder for the infected individual to fight infections. The worst prognosis for an infected individual is AIDS; however, not every infected person develops AIDS, and those who do progress at different rates [5]. We developed a method that predicts the prognosis of human HIV infection from non-redundant HIV genomic and proteomic sequence data. Using random forest classification on the genomic data, we obtained over 91% accuracy across four disease levels. We also analyzed the proteins expressed from five of the nine HIV genes and found that the rev gene had the highest predictive performance for disease level. Using a decision tree, we extracted rules that identify specific protein variants suggestive of particular disease outcomes. This information may help researchers understand which underlying gene variants are associated with different patient outcomes. Moreover, this knowledge could improve the selection of appropriate treatment methods depending on the predicted infection level, and could also improve drug targeting.
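To illustrate the kind of rule a decision tree can output from protein sequence variants, the following is a minimal sketch and not the authors' implementation. It finds a single best split (a decision stump, the basic building block of decision trees and random forests) by Gini impurity over aligned sequences. The toy sequences, the labels "slow" and "fast", and all function names are hypothetical and invented purely for illustration; the abstract does not specify the features, labels, or software used.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(sequences, labels):
    """Find the (position, residue) test with the lowest weighted Gini impurity.

    Assumes the sequences are already aligned and of equal length.
    Returns (impurity, position, residue), or None if no test separates the data.
    """
    n = len(sequences)
    best = None
    for pos in range(len(sequences[0])):
        for residue in sorted({s[pos] for s in sequences}):
            match = [y for s, y in zip(sequences, labels) if s[pos] == residue]
            other = [y for s, y in zip(sequences, labels) if s[pos] != residue]
            if not match or not other:
                continue  # this test does not split the data
            score = (len(match) * gini(match) + len(other) * gini(other)) / n
            if best is None or score < best[0]:
                best = (score, pos, residue)
    return best


def rule_from_split(sequences, labels):
    """Turn the best split into a readable rule with majority-class outcomes."""
    split = best_split(sequences, labels)
    if split is None:
        return None
    score, pos, residue = split
    match = [y for s, y in zip(sequences, labels) if s[pos] == residue]
    other = [y for s, y in zip(sequences, labels) if s[pos] != residue]
    return {
        "position": pos,
        "residue": residue,
        "if_match": Counter(match).most_common(1)[0][0],
        "otherwise": Counter(other).most_common(1)[0][0],
        "impurity": score,
    }


# Hypothetical toy data: invented aligned peptide fragments and invented labels.
TOY_SEQUENCES = ["MAGRS", "MAGKS", "MTGRS", "MTGKS", "MAGRT", "MTGKT"]
TOY_LABELS = ["slow", "fast", "slow", "fast", "slow", "fast"]

rule = rule_from_split(TOY_SEQUENCES, TOY_LABELS)
print(
    f"if residue at position {rule['position']} is {rule['residue']}: "
    f"predict {rule['if_match']}, otherwise predict {rule['otherwise']}"
)
```

A full decision tree would apply this split recursively to each resulting subset, and a random forest would train many such trees on resampled data and random feature subsets, then combine their votes.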