Predicting protein-peptide binding affinity by learning peptide-peptide distance functions

Authors:
Chen Yanover;Tomer Hertz
Affiliations:
School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel;School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
Venue:
RECOMB'05 Proceedings of the 9th Annual international conference on Research in Computational Molecular Biology
Year:
2005

References:
Improved Boosting Algorithms Using Confidence-rated Predictions. Machine Learning - The Eleventh Annual Conference on computational Learning Theory
Boosting margin based distance functions for clustering. ICML '04 Proceedings of the twenty-first international conference on Machine learning
Learning distance functions for image retrieval. CVPR'04 Proceedings of the 2004 IEEE computer society conference on Computer vision and pattern recognition

Cited by:
Leveraging information across HLA alleles/supertypes improves epitope prediction. RECOMB'06 Proceedings of the 10th annual international conference on Research in Computational Molecular Biology

Abstract

Many important cellular response mechanisms are activated when a peptide binds to an appropriate receptor. In the immune system, the recognition of pathogen peptides begins when they bind to cell membrane Major Histocompatibility Complexes (MHCs). MHC proteins then carry these peptides to the cell surface to allow the activation of cytotoxic T-cells. The MHC binding cleft is highly polymorphic, and protein-peptide binding is therefore highly specific. Developing computational methods for predicting protein-peptide binding is important for vaccine design and for the treatment of diseases such as cancer. Previous learning approaches address the binding prediction problem using traditional margin-based binary classifiers. In this paper we propose a novel approach for predicting binding affinity, based on learning a peptide-peptide distance function. Moreover, we learn a single peptide-peptide distance function over an entire family of proteins (e.g., MHC class I). This distance function can be used to compute the affinity of a novel peptide to any of the proteins in the given family. To learn these peptide-peptide distance functions, we formalize the problem as a semi-supervised learning problem with partial information in the form of equivalence constraints. Specifically, we propose to use DistBoost [1, 2], a semi-supervised distance learning algorithm. We compare our method to various state-of-the-art binding prediction algorithms on MHC class I and MHC class II datasets. In almost all cases, our method outperforms all of its competitors. One of the major advantages of our approach is that it can also learn an affinity function over proteins for which only a small number of labeled peptides exist. In these cases, DistBoost's performance gain over other computational methods is even more pronounced.
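The following minimal Python sketch illustrates the general idea of distance-based affinity prediction described in the abstract. It is not the authors' implementation and does not implement DistBoost: a normalized Hamming distance stands in for the learned distance function, and scoring a peptide by its negated mean distance to known binders is an assumption made here for illustration. The way the equivalence constraints are built (binder-binder pairs as positive, binder/non-binder pairs as negative) is likewise an assumption, and the peptide strings are invented toy sequences, not experimental data. All function names are hypothetical.

```python
from itertools import combinations
from statistics import mean


def hamming_distance(a: str, b: str) -> float:
    """Stand-in for a learned distance: fraction of mismatched positions."""
    if len(a) != len(b):
        raise ValueError("peptides must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def equivalence_constraints(binders, non_binders):
    """Build pairwise constraints for one protein (assumed scheme).

    Positive pairs: two peptides that both bind the protein.
    Negative pairs: one binder paired with one non-binder.
    """
    positive = list(combinations(binders, 2))
    negative = [(b, n) for b in binders for n in non_binders]
    return positive, negative


def predicted_affinity(peptide, known_binders, distance=hamming_distance):
    """Score a peptide by its negated mean distance to known binders.

    Higher scores mean the peptide is closer to the known binders.
    """
    return -mean(distance(peptide, b) for b in known_binders)


# Invented toy 9-mers for a single hypothetical protein.
binders = ["ACDEFGHIK", "ACDEFGHIL", "ACDEFGHIV"]
non_binders = ["WWWWWWWWW"]

positive, negative = equivalence_constraints(binders, non_binders)
score_close = predicted_affinity("ACDEFGHIK", binders)
score_far = predicted_affinity("WWWWWWWWW", binders)
```

In this sketch, `positive` and `negative` are the kind of pairwise side information a distance learner would be trained on, and swapping `hamming_distance` for a learned function via the `distance` parameter leaves the scoring step unchanged.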