Probabilistic inference of molecular networks from noisy data sources

  • Authors:
  • Ivan Iossifov; Michael Krauthammer; Carol Friedman; Vasileios Hatzivassiloglou; Joel S. Bader; Kevin P. White; Andrey Rzhetsky

  • Affiliations:
  • Department of Medical Informatics; Department of Medical Informatics; Department of Medical Informatics; Department of Computer Science, Columbia University, New York, NY 10027, USA; CuraGen Corporation, New Haven, CT 06511, USA; Department of Genetics, Yale University School of Medicine, New Haven, CT 06520, USA; Department of Medical Informatics

  • Venue:
  • Bioinformatics
  • Year:
  • 2004

Quantified Score

Hi-index 3.84

Abstract

Summary: Information on molecular networks, such as networks of interacting proteins, comes from diverse sources that differ markedly in the distribution and quantity of their errors. Here, we introduce a probabilistic model for predicting protein interactions from heterogeneous data sources. The model describes the stochastic generation of protein-protein interaction networks with real-world properties, as well as the generation of two heterogeneous sources of protein-interaction information: research results automatically extracted from the literature and yeast two-hybrid experiments. Using the domain composition of proteins, we apply the model to predict interactions for pairs of proteins for which no experimental data are available. We further explore the limits of prediction when the experimental data cover only part of the underlying protein networks. This approach can be extended naturally to include other types of biological data sources.
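
To illustrate the general idea of combining sources with different error profiles, the following is a minimal sketch and not the model from the paper. It assumes a simple naive-Bayes combination in which each source reports independently given the true interaction state. The function name `posterior_interaction`, the source labels, the prior, and all error rates are hypothetical values chosen for illustration only.

```python
def posterior_interaction(prior, observations, error_rates):
    """Posterior probability that a protein pair interacts.

    Assumption (not from the paper): sources are conditionally
    independent given the true state, and every rate is strictly
    between 0 and 1.

    prior        -- assumed prior probability of interaction
    observations -- dict: source name -> True if the source reports
                    an interaction, False if it reports none
    error_rates  -- dict: source name -> (false_positive_rate,
                    false_negative_rate)
    """
    weight_true = prior          # joint weight for "pair interacts"
    weight_false = 1.0 - prior   # joint weight for "pair does not interact"
    for source, reported in observations.items():
        fp_rate, fn_rate = error_rates[source]
        if reported:
            weight_true *= 1.0 - fn_rate   # true positive
            weight_false *= fp_rate        # false positive
        else:
            weight_true *= fn_rate         # false negative
            weight_false *= 1.0 - fp_rate  # true negative
    return weight_true / (weight_true + weight_false)


# Hypothetical error rates for two sources (illustrative values only).
rates = {
    "literature": (0.10, 0.20),
    "two_hybrid": (0.30, 0.40),
}

one_source = posterior_interaction(0.5, {"literature": True}, rates)
both_agree = posterior_interaction(
    0.5, {"literature": True, "two_hybrid": True}, rates
)
disagree = posterior_interaction(
    0.5, {"literature": True, "two_hybrid": False}, rates
)
```

In this sketch, agreement between the two sources raises the posterior above what either gives alone, while a conflicting report from the noisier source lowers it only moderately. The paper's model is richer: it also describes network generation and uses protein domain composition, neither of which is represented here.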