Using Gaussian Process with Test Rejection to Detect T-Cell Epitopes in Pathogen Genomes

Authors:
Liwen You;Vladimir Brusic;Marcus Gallagher;Mikael Boden
Affiliations:
University of Lund, Lund;Dana-Farber Cancer Institute, Boston;University of Queensland, St Lucia;University of Queensland, St Lucia
Venue:
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Year:
2010

Abstract

A major challenge in the development of peptide-based vaccines is finding the right immunogenic element, with efficient and long-lasting immunization effects, among the large number of potential targets encoded by pathogen genomes. Computer models are convenient tools for scanning pathogen genomes to preselect candidate immunogenic peptides for experimental validation. Because true positives have a low prevalence, current methods predict many false positives. We develop a test rejection method based on the prediction uncertainty estimates provided by Gaussian process regression. This method filters false positives among predicted epitopes from a pathogen genome. The performance of stand-alone Gaussian process regression is compared to other state-of-the-art methods using cross-validation on 11 benchmark data sets. The results show that the Gaussian process method has the same accuracy as the top-performing algorithms. The combination of Gaussian process regression with the proposed test rejection method is used to detect true epitopes from the Vaccinia virus genome. The test rejection increases the prediction accuracy by reducing the number of false positives without sacrificing the method's sensitivity. We show that the Gaussian process in combination with test rejection is an effective method for predicting T-cell epitopes in large and diverse pathogen genomes, where false positives are of concern.
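
The idea of rejecting predictions by their uncertainty can be illustrated with a minimal sketch. It is not the authors' implementation: the squared-exponential kernel, the numeric peptide encoding, the function names (`rbf_kernel`, `gp_predict`, `predict_with_rejection`), and the thresholds `score_threshold` and `max_std` are all assumptions chosen for illustration. The sketch fits a standard Gaussian process regression, then keeps a predicted positive only when its predictive standard deviation is below a cut-off.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)


def gp_predict(x_train, y_train, x_test, length_scale=1.0, noise=1e-2):
    """Posterior mean and standard deviation of a zero-mean GP regression.

    Assumes unit signal variance; `noise` is the observation noise variance.
    """
    k = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    k_s = rbf_kernel(x_train, x_test, length_scale)
    chol = np.linalg.cholesky(k)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_s.T @ alpha
    v = np.linalg.solve(chol, k_s)
    var = 1.0 - np.sum(v**2, axis=0)  # prior variance k(x, x) is 1 here
    return mean, np.sqrt(np.maximum(var, 0.0))


def predict_with_rejection(mean, std, score_threshold=0.5, max_std=0.5):
    """Flag predicted positives, then reject those with high uncertainty.

    Both thresholds are hypothetical values, not taken from the paper.
    """
    predicted_positive = mean >= score_threshold
    accepted = predicted_positive & (std <= max_std)
    return predicted_positive, accepted


# Toy usage: random vectors stand in for encoded peptides (an assumption).
rng = np.random.default_rng(0)
x_train = rng.normal(size=(30, 4))
y_train = (x_train[:, 0] > 0).astype(float)  # synthetic binding scores
x_test = rng.normal(size=(10, 4))
mean, std = gp_predict(x_train, y_train, x_test)
positive, accepted = predict_with_rejection(mean, std)
print("predicted positives:", int(positive.sum()), "kept:", int(accepted.sum()))
```

In this sketch, a candidate far from the training data receives a predictive standard deviation close to the prior value, so it is rejected even if its predicted score exceeds the threshold. In practice the cut-off would need to be tuned on held-out data.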