Classification of MHC I proteins according to their ligand-type specificity

Authors:
Eduardo Martínez-Naves;Esther M. Lafuente;Pedro A. Reche
Affiliations:
Department of Microbiology I-Immunology, Facultad de Medicina, Universidad Complutense de Madrid, Madrid, Spain;Department of Microbiology I-Immunology, Facultad de Medicina, Universidad Complutense de Madrid, Madrid, Spain;Laboratory of Immunomedicine, Universidad Complutense de Madrid, Madrid, Spain and Department of Microbiology I-Immunology, Facultad de Medicina, Universidad Complutense de Madrid, Madrid, Spain
Venue:
ICARIS'11 Proceedings of the 10th international conference on Artificial immune systems
Year:
2011

Abstract

Major histocompatibility complex class I (MHC I) molecules belong to a large and diverse protein superfamily whose families can be divided into three groups according to the type of ligand that they can accommodate (ligand-type specificity): peptides, lipids or none. Here, we assembled a dataset of MHC I proteins of known ligand-type specificity (MHCI556 dataset) and used it to train k-nearest neighbor and support vector machine classifiers. In cross-validation, the resulting classifiers predicted the ligand-type specificity of MHC I molecules with an accuracy ≥ 99%, using solely their amino acid composition. By holding out entire MHC I families prior to model building, we showed that machine learning-based classifiers trained on amino acid composition can predict the ligand-type specificity of MHC I molecules unrelated to those used for model building. Moreover, they are superior to BLAST at predicting the class of MHC I molecules that do not bind any ligand.
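
To make the idea of classifying from amino acid composition concrete, the following is a minimal sketch and not the authors' implementation. It assumes the feature vector is the fraction of each of the 20 standard amino acids in a sequence and uses a plain Euclidean-distance k-nearest neighbor vote. The function names (`aa_composition`, `knn_predict`) and the toy sequences in `TOY_TRAIN` are hypothetical and invented for illustration; they are not real MHC I sequences and are not taken from the MHCI556 dataset.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_composition(seq):
    """Return the 20-dimensional vector of amino acid fractions of a sequence."""
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def knn_predict(train, query_seq, k=1):
    """Majority vote among the k training sequences closest in composition space."""
    query = aa_composition(query_seq)
    neighbours = sorted(
        train, key=lambda item: math.dist(aa_composition(item[0]), query)
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: made-up strings, NOT real MHC I sequences.
TOY_TRAIN = [
    ("AAAAKKKKLLLL", "peptide"),
    ("AAAKKKKLLLLL", "peptide"),
    ("FFFFWWWWVVVV", "lipid"),
    ("FFFWWWWVVVVV", "lipid"),
    ("DDDDEEEEGGGG", "none"),
    ("DDDEEEEGGGGG", "none"),
]

if __name__ == "__main__":
    print(knn_predict(TOY_TRAIN, "AAKKKKLLLLLL", k=3))
```

Because the features ignore residue order, two proteins with little alignable similarity can still lie close together in composition space, which is one plausible reading of why such classifiers could generalize to held-out families.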