Infection by the human papillomavirus (HPV) is regarded as the major risk factor in the development of cervical cancer. Detection of high-risk HPV is important for understanding its oncogenic mechanisms and for developing novel clinical tools for its diagnosis, treatment, and prevention. Several methods are available to predict the risk types of HPV protein sequences. Nevertheless, no tool achieves universally good performance across all domains, including HPV, nor do existing tools provide confidence levels for their decisions. Here, we describe ensembled support vector machines (SVMs) that classify HPV risk types by assigning a given protein to the high-, possibly high-, or low-risk type based on the confidence level of the prediction. Our approach uses protein secondary structures to obtain the differential contribution of subsequences to the risk type, and the SVM classifiers are combined with a simple but efficient string kernel to handle HPV protein sequences. In the experiments, we compare our approach with previous methods in terms of accuracy and F1-score, and present predictions for unknown HPV types, which show promising results.
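The two ideas in the abstract, a string kernel over protein sequences and a three-way risk assignment driven by ensemble confidence, can be illustrated with a minimal sketch. The abstract does not name the string kernel or the confidence rule, so everything here is an assumption: a k-spectrum kernel stands in for the "simple but efficient string kernel", the fraction of ensemble members voting high-risk stands in for the confidence level, and the function names and the thresholds 0.8 and 0.2 are hypothetical. The secondary-structure weighting of subsequences is not modelled.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every contiguous k-mer in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=3):
    """k-spectrum kernel: inner product of the two k-mer count vectors.

    Assumption: used here only as a stand-in for the string kernel
    mentioned in the abstract, which is not specified.
    """
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    return sum(n * cb[kmer] for kmer, n in ca.items())


def risk_label(votes, high=0.8, low=0.2):
    """Map ensemble votes (1 = high risk, 0 = low risk) to a risk type.

    The thresholds are hypothetical; the abstract only states that
    the assignment depends on a confidence level.
    """
    frac = sum(votes) / len(votes)
    if frac >= high:
        return "high"
    if frac <= low:
        return "low"
    return "possibly high"
```

A kernel matrix built from `spectrum_kernel` over all training sequences could then be passed to any SVM implementation that accepts precomputed kernels, with each ensemble member contributing one vote to `risk_label`.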