Human papillomavirus risk type classification from protein sequences using support vector machines

  • Authors:
  • Sun Kim;Byoung-Tak Zhang

  • Affiliations:
  • Biointelligence Laboratory, School of Computer Science and Engineering, Seoul National University, Seoul, South Korea;Biointelligence Laboratory, School of Computer Science and Engineering, Seoul National University, Seoul, South Korea

  • Venue:
  • EuroGP'06 Proceedings of the 2006 international conference on Applications of Evolutionary Computing
  • Year:
  • 2006

Abstract

Infection by the human papillomavirus (HPV) is associated with the development of cervical cancer. HPV types can be classified as high-risk or low-risk according to their malignant potential, and detecting the risk type is important both for understanding the underlying mechanisms and for diagnosing potential patients. In this paper, we classify HPV protein sequences using support vector machines. A string kernel is introduced to discriminate HPV protein sequences; the kernel emphasizes pairs of amino acids separated by a distance. In the experiments, our approach is compared with previous methods in terms of accuracy and F1-score, and it shows better performance. Prediction results for unknown HPV types are also presented.
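
To make the idea of a kernel over distance-separated amino acid pairs concrete, the following is a minimal Python sketch. It is not the kernel defined in the paper, whose exact form is not given in this abstract; it assumes a simple variant in which each feature is a triple (first residue, second residue, distance) and the kernel is the inner product of the feature counts. The function names and the `max_dist` parameter are hypothetical and chosen only for illustration.

```python
from collections import Counter
import math


def gapped_pair_features(seq, max_dist=3):
    """Count triples (a, b, d): residue a followed by residue b exactly d positions later.

    Assumption: distances 1..max_dist are all counted with equal weight.
    """
    counts = Counter()
    for i, a in enumerate(seq):
        for d in range(1, max_dist + 1):
            if i + d < len(seq):
                counts[(a, seq[i + d], d)] += 1
    return counts


def pair_kernel(s, t, max_dist=3):
    """Inner product of the gapped-pair count vectors of two sequences."""
    fs = gapped_pair_features(s, max_dist)
    ft = gapped_pair_features(t, max_dist)
    # Iterate over the smaller feature map; a Counter returns 0 for missing keys.
    if len(fs) > len(ft):
        fs, ft = ft, fs
    return sum(v * ft[k] for k, v in fs.items())


def normalized_pair_kernel(s, t, max_dist=3):
    """Cosine-normalized kernel, so that sequence length has less influence."""
    denom = math.sqrt(pair_kernel(s, s, max_dist) * pair_kernel(t, t, max_dist))
    return pair_kernel(s, t, max_dist) / denom if denom else 0.0
```

A Gram matrix built from `normalized_pair_kernel` over a set of protein sequences could then be passed to any SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.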