The human adaptive immune response relies on the recognition of short peptides through proteins of the major histocompatibility complex (MHC). MHC class II molecules are responsible for the recognition of antigens external to a cell. Understanding their specificity is an important step in the design of peptide-based vaccines. The high degree of polymorphism in MHC class II makes predicting which peptides bind (and then usually cause an immune response) a challenging task. Typically, these predictions rely on machine learning methods, so a sufficient number of data points is required. Because data are scarce, reliable prediction models are currently available for only about 7% of all known alleles.

We show how to transform the problem of MHC class II binding peptide prediction into a well-studied machine learning problem called multiple instance learning. For alleles with sufficient data, we show how to build a well-performing predictor using standard kernels for multiple instance learning. Furthermore, we introduce a new method for training a classifier for an allele without requiring binding data for that target allele. Instead, we use binding peptide data from other alleles, together with similarities between the structures of the MHC class II alleles, to guide the learning process. For the first time, this allows predictors to be constructed for about two thirds of all known MHC class II alleles. The average performance of these predictors on 14 test alleles is 0.71, measured as area under the ROC curve.

Availability: The methods are integrated into the EpiToolKit framework, for which a web server is available at http://www.epitoolkit.org/mhciimulti
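As a rough illustration of the multiple instance view described above, the sketch below treats a peptide as a bag of its contiguous windows and compares two bags with a normalized set kernel. This is a minimal sketch under stated assumptions, not the method of the paper: the window length of 9, the position-wise match kernel on instances, and the function names `peptide_to_bag`, `instance_kernel`, `set_kernel`, and `normalized_set_kernel` are all hypothetical choices made for the example, and the two peptide strings are arbitrary illustrative sequences rather than data from the study.

```python
import math


def peptide_to_bag(peptide, core_len=9):
    """Return every contiguous window of length core_len as one instance.

    Assumption: a window length of 9 stands in for the binding core length.
    """
    if len(peptide) < core_len:
        raise ValueError("peptide shorter than the assumed core length")
    return [peptide[i:i + core_len] for i in range(len(peptide) - core_len + 1)]


def instance_kernel(a, b):
    """Toy instance kernel: fraction of positions with identical residues.

    Assumption: both instances have the same length (true for bags built
    by peptide_to_bag with a fixed core_len).
    """
    return sum(x == y for x, y in zip(a, b)) / len(a)


def set_kernel(bag_x, bag_y):
    """Sum the instance kernel over all pairs of instances from the two bags."""
    return sum(instance_kernel(x, y) for x in bag_x for y in bag_y)


def normalized_set_kernel(bag_x, bag_y):
    """Set kernel scaled so that every bag has similarity 1 with itself."""
    norm = math.sqrt(set_kernel(bag_x, bag_x) * set_kernel(bag_y, bag_y))
    return set_kernel(bag_x, bag_y) / norm


# Arbitrary illustrative sequences (hypothetical inputs, not study data).
bag_a = peptide_to_bag("PKYVKQNTLKLAT")
bag_b = peptide_to_bag("AAYSDQATPLLLSPR")
similarity = normalized_set_kernel(bag_a, bag_b)
```

A matrix of such bag-level similarities could in principle be passed to any kernel classifier that accepts a precomputed kernel. The label of a bag (binder or non-binder) would apply to the peptide as a whole, since the binding window is not known in advance.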