An integrated probabilistic model for functional prediction of proteins

  • Authors:
  • Minghua Deng;Ting Chen;Fengzhu Sun

  • Affiliations:
  • University of Southern California, Los Angeles, CA;University of Southern California, Los Angeles, CA;University of Southern California, Los Angeles, CA

  • Venue:
  • RECOMB '03 Proceedings of the seventh annual international conference on Research in computational molecular biology
  • Year:
  • 2003

Quantified Score

Hi-index 0.00

Visualization

Abstract

We develop an integrated probabilistic model to combine protein physical interactions, genetic interactions, highly correlated gene expression network, protein complex data, and domain structures of individual proteins to predict protein functions. The model is an extension of our previous model for protein function prediction based on Markovian random field theory. The model is flexible in that other protein pairwise relationship information and features of individual proteins can be easily incorporated. Two features distinguish the integrated approach from other available methods for protein function prediction. One is that the integrated approach uses all available sources of information with different weights for different sources of data. It is a global approach that takes the whole network into consideration. The second feature is that the posterior probability that a protein has the function of interest is assigned. The posterior probability indicates how confident we are about assigning the function to the protein. We apply our integrated approach to predict functions of yeast proteins based upon MIPS protein function classifications and upon the interaction networks based on MIPS physical and genetic interactions, gene expression profiles, Tandem Affinity Purification (TAP) protein complex data, and protein domain information. We study the sensitivity and specificity of the integrated approach using different sources of information by the leave-one-out approach. In contrast to using MIPS physical interactions only, the integrated approach combining all of the information increases the sensitivity from 57% to 87% when the specificity is set at 57%-an increase of 30%. It should also be noted that enlarging the interaction network greatly increases the number of proteins whose functions can be predicted.