Protein interaction prediction using inferred domain interactions and biologically-significant negative dataset

  • Authors: Xiao-Li Li; Soon-Heng Tan; See-Kiong Ng

  • Affiliations: Institute for Infocomm Research, Singapore (all authors)

  • Venue: ICCSA'05 Proceedings of the 2005 international conference on Computational Science and Its Applications - Volume Part III
  • Year: 2005

Abstract

Protein domains are evolutionarily conserved structural or functional subunits of proteins that are suggestive of the proteins' propensity to interact or form a stable complex. In this paper, we propose a novel domain-based probabilistic classification method to predict protein-protein interactions. Our method learns the interaction probabilities of domain pairs from domain pairing information derived from both experimentally determined interacting protein pairs and carefully chosen non-interacting protein pairs. Unlike conventional approaches that use random pairing to generate artificial non-interacting protein pairs as negative training data, we generate biologically meaningful non-interacting protein pairs based on the proteins' biological information. Such careful generation of the negative training data set is shown to result in a more accurate classifier. Our classifier predicts potential interactions between any pair of proteins based on the probabilistically inferred domain interactions. Comparative results showed that our probabilistic approach is effective and outperforms other domain-based techniques for protein interaction prediction.
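
The abstract does not give the paper's scoring formulas, so the following is only a minimal sketch of the general idea of domain-based interaction prediction, not the authors' method. It assumes a smoothed frequency estimate for each domain pair (how often the pair occurs in interacting versus non-interacting protein pairs) and a noisy-OR rule for combining domain-pair probabilities into a protein-level score; both choices are illustrative assumptions. All function names, the `pseudocount` parameter, and the toy protein and domain labels are hypothetical, and the negative pairs are simply taken as given input rather than generated from biological information.

```python
from collections import Counter
from itertools import product


def domain_pairs(domains_a, domains_b):
    """Return the set of unordered domain pairs formed between two proteins."""
    return {tuple(sorted(pair)) for pair in product(domains_a, domains_b)}


def learn_pair_probabilities(domains, positives, negatives, pseudocount=1.0):
    """Estimate an interaction probability for each observed domain pair.

    Assumption (not from the paper): a smoothed ratio of how often a domain
    pair appears in interacting protein pairs versus all labeled pairs.
    """
    pos_counts, neg_counts = Counter(), Counter()
    for a, b in positives:
        pos_counts.update(domain_pairs(domains[a], domains[b]))
    for a, b in negatives:
        neg_counts.update(domain_pairs(domains[a], domains[b]))

    probs = {}
    for pair in set(pos_counts) | set(neg_counts):
        pos, neg = pos_counts[pair], neg_counts[pair]
        probs[pair] = (pos + pseudocount) / (pos + neg + 2 * pseudocount)
    return probs


def predict_interaction(domains_a, domains_b, probs, default=0.0):
    """Combine domain-pair probabilities with a noisy-OR rule (an assumption).

    Two proteins are scored as interacting if at least one of their domain
    pairs interacts; unseen domain pairs contribute `default`.
    """
    p_none = 1.0
    for pair in domain_pairs(domains_a, domains_b):
        p_none *= 1.0 - probs.get(pair, default)
    return 1.0 - p_none


# Toy data with hypothetical protein IDs and domain labels.
domains = {
    "P1": {"SH3"},
    "P2": {"PRM"},
    "P3": {"SH3"},
    "P4": {"KIN"},
    "P5": {"PRM"},
}
positives = [("P1", "P2"), ("P3", "P5")]  # interacting protein pairs
negatives = [("P1", "P4")]                # non-interacting protein pairs

probs = learn_pair_probabilities(domains, positives, negatives)
score = predict_interaction({"SH3"}, {"PRM", "KIN"}, probs)
print(f"Predicted interaction score: {score:.3f}")
```

In this sketch the quality of the negative pairs directly shapes the estimates: a domain pair that appears often among the non-interacting examples is pushed toward a low probability, which is why the choice of negative training data matters for a classifier of this kind.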