Given a query graph q and a data graph g, subgraph isomorphism search finds all occurrences of q in g and is considered one of the most fundamental query types for many real applications. Although this problem is NP-hard, many algorithms have been proposed to solve it in a reasonable time for real datasets. However, a recent study has shown, through an extensive benchmark with various real datasets, that all existing algorithms have serious problems in their matching order selection. Furthermore, all algorithms blindly permute all possible mappings for query vertices, often leading to useless computations. In this paper, we present an efficient and robust subgraph search solution, called TurboISO, which is turbo-charged with two novel concepts: candidate region exploration and the combine and permute strategy (in short, Comb/Perm). Candidate region exploration identifies, on the fly, candidate subgraphs (i.e., candidate regions) that contain embeddings, and computes a robust matching order for each candidate region explored. The Comb/Perm strategy exploits the novel concept of the neighborhood equivalence class (NEC): all query vertices in the same NEC have identical sets of matching data vertices. During subgraph isomorphism search, Comb/Perm generates only combinations for each NEC instead of permuting all possible enumerations. Thus, if a chosen combination is determined not to contribute to a complete solution, all possible permutations of that combination can be safely pruned. Extensive experiments with many real datasets show that TurboISO consistently and significantly outperforms all competitors by up to several orders of magnitude.
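The Comb/Perm idea can be illustrated with a minimal Python sketch. It is not the TurboISO implementation: it assumes a simplified notion of neighborhood equivalence (two query vertices are grouped when they share the same label and the same neighbor set), and the names `nec_classes` and `expand`, the toy query graph, and the candidate list are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations, permutations


def nec_classes(labels, adj):
    """Group query vertices by (label, neighbor set).

    Simplifying assumption: vertices with the same label and the same
    open neighborhood are treated as neighborhood-equivalent.
    """
    groups = defaultdict(list)
    for u in labels:
        groups[(labels[u], frozenset(adj[u]))].append(u)
    return sorted(sorted(g) for g in groups.values())


def expand(members, chosen):
    """Permute one accepted combination over the members of an NEC."""
    return [dict(zip(members, p)) for p in permutations(chosen)]


# Hypothetical query: hub "a" (label A) with three leaves (label B).
labels = {"a": "A", "x": "B", "y": "B", "z": "B"}
adj = {"a": {"x", "y", "z"}, "x": {"a"}, "y": {"a"}, "z": {"a"}}
classes = nec_classes(labels, adj)

# Hypothetical data vertices that could match any of the B-labeled leaves.
candidates = [1, 2, 3, 4]
leaf_class = classes[1]

# Comb: test each 3-subset once instead of every ordered assignment.
combos = list(combinations(candidates, len(leaf_class)))
orderings = list(permutations(candidates, len(leaf_class)))

# Perm: only a combination that survives checking is expanded.
mappings = expand(leaf_class, combos[0])
```

Here the three leaves fall into one class, so the search inspects 4 combinations rather than 24 ordered assignments; rejecting one combination discards its 6 permutations at once, and an accepted combination is expanded into those 6 mappings only at the end.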