Shift-invariant adaptive double threading: learning MHC II - peptide binding
RECOMB'07 Proceedings of the 11th annual international conference on Research in computational molecular biology
Understanding prediction systems for HLA-binding peptides and T-cell epitope identification
PRIB'07 Proceedings of the 2nd IAPR international conference on Pattern recognition in bioinformatics
Variable selection through correlation sifting
RECOMB'11 Proceedings of the 15th Annual international conference on Research in computational molecular biology
Nonparametric combinatorial sequence models
RECOMB'11 Proceedings of the 15th Annual international conference on Research in computational molecular biology
Using HLA binding prediction algorithms for epitope mapping in HIV vaccine clinical trials
Proceedings of the 2nd ACM Conference on Bioinformatics, Computational Biology and Biomedicine
Template-based scoring functions for visualising biological insights of H-2Kb-peptide-TCR complexes
International Journal of Data Mining and Bioinformatics
Motivation and results: Motivated by the ability of a simple threading approach to predict MHC I-peptide binding, we developed an improved structure-based model whose parameters can be estimated from additional sources of data about MHC-peptide binding. In addition to the known 3D structures of a small number of MHC-peptide complexes that were used in the original threading approach, we included three other sources of information on peptide-MHC binding: (1) MHC class I sequences; (2) known binding energies for a large number of MHC-peptide complexes; and (3) an even larger binary dataset that contains information about strong binders (epitopes) and non-binders (peptides that have a low affinity for a particular MHC molecule). Our model significantly outperforms the standard threading approach in binding energy prediction. In our approach, which we call adaptive double threading, the parameters of the threading model are learnable, and both MHC and peptide sequences can be threaded onto structures of other alleles. These two properties make our model appropriate for predicting binding for alleles for which little or no data is available beyond their sequence, including alleles for which 3D structures are not available. The ability of our model to generalize beyond the MHC types for which training data is available also separates our approach from epitope prediction methods that treat MHC alleles as symbolic types rather than as biological sequences. We used the trained binding energy predictor to study viral infections in 246 HIV patients from the West Australian cohort and over 1000 HIV clade B sequences from the Los Alamos National Laboratory database, capturing the course of HIV evolution over the last 20 years. Finally, we illustrate short-, medium-, and long-term adaptation of HIV to the human immune system.

Availability: Contact: jojic@microsoft.com
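
The double-threading idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the contact list, the pairwise potential table, the per-contact weights, and the distance cutoff below are all hypothetical values chosen for illustration. The sketch only shows the general shape of a threading score in which both the MHC sequence and the peptide sequence are placed onto a fixed structural template, and in which the per-contact weights are the kind of parameter that could be fitted to binding data.

```python
from typing import Dict, List, Tuple

# Hypothetical template contact: (MHC residue index, peptide position, distance in angstroms)
Contact = Tuple[int, int, float]


def threading_energy(
    mhc_seq: str,
    peptide: str,
    contacts: List[Contact],
    potential: Dict[Tuple[str, str], float],
    weights: Dict[Tuple[int, int], float],
    cutoff: float = 4.5,  # assumed contact-distance threshold, not from the source
) -> float:
    """Sum weighted pairwise potentials over template contacts within the cutoff."""
    energy = 0.0
    for mhc_idx, pep_idx, dist in contacts:
        if dist > cutoff:
            continue  # residues too far apart in the template to interact
        # "Double" threading: both residues come from the query sequences,
        # while the geometry comes from the template structure.
        pair = (mhc_seq[mhc_idx], peptide[pep_idx])
        weight = weights.get((mhc_idx, pep_idx), 1.0)  # learnable in an adaptive model
        energy += weight * potential.get(pair, 0.0)
    return energy


# Toy template and parameters (all values are made up for illustration).
contacts = [(0, 0, 3.8), (1, 1, 4.0), (2, 1, 6.0)]
potential = {("Y", "L"): -1.0, ("E", "A"): 0.5, ("K", "A"): 2.0}
weights = {(0, 0): 2.0}

energy_allele_a = threading_energy("YEK", "LA", contacts, potential, weights)
# Threading a different MHC sequence onto the same template changes the score.
energy_allele_b = threading_energy("FEK", "LA", contacts, potential, weights)
```

In this toy setup, swapping the MHC sequence while keeping the template changes the score, which is the property that lets a sequence-aware model make predictions for alleles that lack their own structure or training data.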