Assessing significance of connectivity and conservation in protein interaction networks

Authors:
Mehmet Koyutürk;Ananth Grama;Wojciech Szpankowski
Affiliations:
Dept. of Computer Sciences, Purdue University, West Lafayette, IN;Dept. of Computer Sciences, Purdue University, West Lafayette, IN;Dept. of Computer Sciences, Purdue University, West Lafayette, IN
Venue:
RECOMB'06 Proceedings of the 10th annual international conference on Research in Computational Molecular Biology
Year:
2006

Abstract

Computational and comparative analyses of protein-protein interaction (PPI) networks enable understanding of the modular organization of the cell through identification of functional modules and protein complexes. These analysis techniques generally rely on topological features such as connectedness, based on the premise that functionally related proteins are likely to interact densely and that these interactions follow similar evolutionary trajectories. Considerable recent work, in our lab and in others, has focused on efficient algorithms for identifying modules and their conservation. Application of these methods to a variety of networks has yielded novel biological insights. Despite these algorithmic advances, the development of a comprehensive infrastructure for interaction databases remains in relative infancy compared with the corresponding sequence analysis tools, such as BLAST and CLUSTAL. One critical component of this infrastructure is a measure of the statistical significance of a match or a dense subcomponent. Corresponding sequence-based measures, such as E-values, are key components of sequence matching tools. In the absence of an analytical measure, conventional methods rely on computer simulations based on ad-hoc models to quantify significance. This paper presents the first effort, to the best of our knowledge, aimed at analytically quantifying the statistical significance of dense components and matches in reference model graphs. We consider two reference graph models: a G(n,p) model, in which each pair of nodes has an identical likelihood, p, of sharing an edge, and a two-level G(n,p) model, which accounts for the high-degree hub nodes that generally occur in PPI networks. We argue that, by choosing the value of p conservatively, the G(n,p) model dominates the power-law graph model that is often used to model PPI networks.
We also propose a method for evaluating statistical significance based on the results derived from this analysis, and demonstrate the use of these measures for assessing significant structures in PPI networks. Experiments performed on a rich collection of PPI networks show that the proposed model provides a reliable means of evaluating statistical significance of dense patterns in these networks.
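
As an illustration of what significance under the G(n,p) reference model can mean, the following Python sketch bounds the probability that some k-node subset of a G(n,p) graph induces at least m edges. It combines a binomial tail for one fixed subset with a union bound over all k-node subsets. This is a simple textbook bound under stated assumptions, not the analytical result derived in the paper, and the function names binom_tail and dense_subgraph_bound are hypothetical.

```python
from math import comb, exp, log, log1p


def binom_tail(trials, m, p):
    """P[X >= m] for X ~ Binomial(trials, p), summed term by term.

    Each term is evaluated in log space so that large binomial
    coefficients do not overflow when converted to floats.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if m <= 0:
        return 1.0
    if m > trials:
        return 0.0
    total = 0.0
    for j in range(m, trials + 1):
        log_term = log(comb(trials, j)) + j * log(p) + (trials - j) * log1p(-p)
        total += exp(log_term)
    return min(1.0, total)


def dense_subgraph_bound(n, k, m, p):
    """Union bound on P[some k-node subset of G(n,p) induces >= m edges].

    Assumption (not from the paper): a fixed k-node subset has
    k*(k-1)/2 node pairs, each an edge independently with probability p,
    and the bound sums the tail over all C(n, k) subsets.
    """
    pairs = k * (k - 1) // 2
    tail = binom_tail(pairs, m, p)
    if tail == 0.0:
        return 0.0
    log_bound = log(comb(n, k)) + log(tail)
    return exp(min(0.0, log_bound))  # cap the bound at 1
```

For example, dense_subgraph_bound(1000, 8, 24, 0.005) bounds the chance of seeing an 8-node subgraph with at least 24 of its 28 possible edges in a sparse 1000-node reference graph. A small value suggests that such a dense pattern is unlikely to arise by chance under the model. The union bound is loose, so it overstates that probability rather than understating it.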