Finding and counting the subgraphs of a given network is an important, nontrivial task with applications across many disciplines. Discovering network motifs and computing graphlet signatures are two examples of methodologies that rely at their core on the subgraph counting problem. Here we present the g-trie, a data structure specifically designed for discovering subgraph frequencies. We build a tree that encapsulates the structure of an entire set of graphs, taking advantage of common topologies in the same way a prefix tree takes advantage of common prefixes. This avoids redundancy in the representation of the graphs and so saves both memory and computation time. We introduce a specialized canonical labeling designed to highlight common substructures, and we annotate the g-trie with a set of conditional rules that break symmetries, avoiding repeated work during the computation. We also introduce a novel algorithm that takes as input a set of small graphs and efficiently finds and counts their occurrences as induced subgraphs of a larger network. We perform an extensive empirical evaluation of our algorithms, focusing on efficiency and scalability on a diverse set of complex networks. Results show that g-tries clearly outperform previously existing algorithms, by at least one order of magnitude.
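
To make the prefix-tree analogy concrete, the following is a minimal sketch, not the authors' implementation. It assumes undirected graphs given as 0/1 adjacency matrices whose vertices are already in a fixed order. It omits the canonical labeling, the symmetry-breaking conditions, and the counting algorithm described above. The names `GTrieNode`, `insert`, and `count_nodes` are hypothetical, chosen for illustration only. Each tree level adds one vertex, keyed by that vertex's connections to the vertices placed before it, so graphs that share an initial topology share a path from the root.

```python
from typing import Dict, List, Tuple

Adjacency = List[List[int]]  # symmetric 0/1 matrix of an undirected graph


class GTrieNode:
    """Hypothetical node: one tree level per vertex of the stored graphs."""

    def __init__(self) -> None:
        # Key: connections of the newly added vertex to vertices 0..v.
        self.children: Dict[Tuple[int, ...], "GTrieNode"] = {}
        self.is_graph = False  # True if some stored graph ends here


def insert(root: GTrieNode, adj: Adjacency) -> None:
    """Insert a graph vertex by vertex, reusing any existing shared prefix."""
    node = root
    for v in range(len(adj)):
        key = tuple(adj[v][: v + 1])
        node = node.children.setdefault(key, GTrieNode())
    node.is_graph = True


def count_nodes(node: GTrieNode) -> int:
    """Number of tree nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(child) for child in node.children.values())


# Assumed vertex orderings: both graphs begin with the same edge 0-1.
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]

root = GTrieNode()
insert(root, triangle)
insert(root, path)
print(count_nodes(root))  # 4 nodes instead of 6: the first two levels are shared
```

In this toy version the amount of sharing depends entirely on the vertex order supplied by the caller. This is the gap that a canonical labeling designed to highlight common substructures is meant to close.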