A regularized discriminative model for the prediction of protein--peptide interactions

  • Authors:
  • Wolfgang P. Lehrach; Dirk Husmeier; Christopher K. I. Williams

  • Affiliations:
  • University of Edinburgh, Edinburgh EH1 2QL, UK; Biomathematics and Statistics Scotland, Edinburgh EH9 3JZ, UK; University of Edinburgh, Edinburgh EH1 2QL, UK

  • Venue:
  • Bioinformatics
  • Year:
  • 2006

Abstract

Motivation: Short well-defined domains known as peptide recognition modules (PRMs) regulate many important protein--protein interactions involved in the formation of macromolecular complexes and biochemical pathways. Since high-throughput experiments like yeast two-hybrid and phage display are expensive and intrinsically noisy, it would be desirable to target them more specifically, or to partially bypass them, with complementary in silico approaches. Here we present a probabilistic discriminative approach to predicting PRM-mediated protein--protein interactions from sequence data. The model is motivated by the discriminative model of Segal and Sharan as an alternative to the generative approach of Reiss and Schwikowski. In our evaluation, we focus on predicting the interaction network. As proposed by Williams, we overcome the problem of susceptibility to over-fitting by adopting a Bayesian a posteriori approach based on a Laplacian prior in parameter space.

Results: The proposed method was tested on two datasets of protein--protein interactions involving 28 SH3 domain proteins in Saccharomyces cerevisiae, where the datasets were obtained with different experimental techniques. The predictions were evaluated with out-of-sample receiver operating characteristic (ROC) curves. In both cases, Laplacian regularization turned out to be crucial for achieving a reasonable generalization performance. The Laplacian-regularized discriminative model outperformed the generative model of Reiss and Schwikowski in terms of the area under the ROC curve on both datasets. The performance was further improved with a hybrid approach, in which our model was initialized with the motifs obtained with the method of Reiss and Schwikowski.

Availability: Software and supplementary material are available from http://lehrach.com/wolfgang/dmf

Contact: wlehrach@ed.ac.uk
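
Two ideas in the abstract can be made concrete with a small sketch: a Laplacian prior on the parameters contributes an L1 penalty to the negative log posterior, and the area under the ROC curve is the fraction of positive/negative pairs that the scores rank correctly. The code below is illustrative only and is not the authors' software; the function names `neg_log_laplace_prior` and `roc_auc`, the `scale` parameter, and the toy inputs are assumptions introduced here.

```python
import math


def neg_log_laplace_prior(weights, scale=1.0):
    """Negative log density of independent zero-mean Laplace priors.

    Each weight w has density exp(-|w| / scale) / (2 * scale), so the
    result is sum(|w|) / scale plus the constant n * log(2 * scale).
    Adding this to a negative log likelihood yields an L1-penalized
    objective. (Hypothetical helper, not taken from the paper.)
    """
    n = len(weights)
    return sum(abs(w) for w in weights) / scale + n * math.log(2.0 * scale)


def roc_auc(scores, labels):
    """Area under the ROC curve by pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as one half.
    Labels are assumed to be 1 (interaction) or 0 (no interaction).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

For example, `roc_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])` compares four positive/negative pairs, three of which are ranked correctly. The pairwise loop is quadratic in the number of examples, which is acceptable for a sketch at the scale of a few dozen proteins but would be replaced by a rank-based computation on larger data.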