Biomolecular network motif counting and discovery by color coding

Authors:
Noga Alon;Phuong Dao;Iman Hajirasouliha;Fereydoun Hormozdiari;S. Cenk Sahinalp
Venue:
Bioinformatics
Year:
2008

Abstract

Protein–protein interaction (PPI) networks of many organisms share global topological features such as degree distribution, k-hop reachability, betweenness and closeness. Yet some of these networks differ significantly from the others in their local structure: for example, the number of occurrences of specific network motifs can vary significantly among PPI networks. Counting network motifs is therefore a major challenge in comparing biomolecular networks. Recently developed algorithms can count the induced occurrences of subgraphs with k ≤ 7 vertices, but no practical algorithm exists for counting non-induced occurrences, or for counting subgraphs with k ≥ 8 vertices. Counting non-induced occurrences of network motifs is not only challenging but also desirable, because available PPI networks include several false interactions and miss many true ones. In this article, we show how to apply the ‘color coding’ technique to count non-induced occurrences of subgraph topologies in the form of trees and bounded-treewidth subgraphs. Our algorithm can count all occurrences of a motif G′ with k vertices in a network G with n vertices in time polynomial in n, provided k = O(log n). We use our algorithm to obtain ‘treelet’ distributions for k ≤ 10 of the available PPI networks of unicellular organisms (Saccharomyces cerevisiae, Escherichia coli and Helicobacter pylori), which are all quite similar, and of a multicellular organism (Caenorhabditis elegans), which is significantly different. Furthermore, the treelet distributions of the unicellular organisms are similar to that obtained by the ‘duplication model’ but quite different from that of the ‘preferential attachment model’. The treelet distribution is robust with respect to sparsification at a bait/edge coverage of 70%, but differences can be observed when bait/edge coverage drops to 50%.

Contact: cenk@cs.sfu.ca
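The color coding idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it handles only the simplest treelet, a simple path on k vertices, and the function names, graph representation (a dict of adjacency lists) and parameters such as `trials` and `seed` are assumptions made for illustration. Each vertex receives one of k random colors, a dynamic program over color subsets counts the "colorful" paths (all k colors distinct), and the count is rescaled by k^k / k!, the inverse of the probability that a fixed k-vertex path becomes colorful.

```python
import math
import random
from collections import defaultdict


def count_colorful_paths(adj, colors, k):
    """Count simple k-vertex paths whose vertices all have distinct colors.

    adj: dict mapping each vertex to a list of neighbours (undirected graph).
    colors: dict mapping each vertex to a color in range(k).
    Assumes k >= 2 (hypothetical helper, for illustration only).
    """
    # table[v][mask] = number of colorful paths ending at v that use
    # exactly the colors in the bitmask `mask`.
    table = {v: defaultdict(int) for v in adj}
    for v in adj:
        table[v][1 << colors[v]] = 1

    # Extend paths one vertex at a time; a path may only be extended to v
    # if the color of v is not already used, which also keeps it simple.
    for _ in range(2, k + 1):
        new = {v: defaultdict(int) for v in adj}
        for v in adj:
            bit = 1 << colors[v]
            for u in adj[v]:
                for mask, cnt in table[u].items():
                    if not mask & bit:
                        new[v][mask | bit] += cnt
        table = new

    full = (1 << k) - 1
    directed = sum(table[v][full] for v in adj)
    # Each undirected path was counted once from each of its two ends.
    return directed // 2


def estimate_path_count(adj, k, trials=200, seed=0):
    """Estimate the number of non-induced k-vertex paths by color coding."""
    rng = random.Random(seed)
    # A fixed k-vertex path is colorful with probability k! / k**k.
    scale = k ** k / math.factorial(k)
    total = 0
    for _ in range(trials):
        colors = {v: rng.randrange(k) for v in adj}
        total += count_colorful_paths(adj, colors, k)
    return scale * total / trials
```

Under these assumptions, one trial costs on the order of 2^k · k · |E| table updates, which stays polynomial in n when k = O(log n). Averaging over more trials reduces the variance of the estimate; counting general trees requires a more elaborate recursion over subtrees than this path-only sketch.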