TurboISO: towards ultrafast and robust subgraph isomorphism search in large graph databases

Authors:
Wook-Shin Han;Jinsoo Lee;Jeong-Hoon Lee
Affiliations:
Kyungpook National University, Daegu, South Korea;Kyungpook National University, Daegu, South Korea;Kyungpook National University, Daegu, South Korea
Venue:
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
Year:
2013

Abstract

Given a query graph q and a data graph g, subgraph isomorphism search finds all occurrences of q in g and is considered one of the most fundamental query types for many real applications. Although this problem is NP-hard, many algorithms have been proposed to solve it in a reasonable time for real datasets. However, a recent study has shown, through an extensive benchmark with various real datasets, that all existing algorithms have serious problems in their matching order selection. Furthermore, all algorithms blindly permute all possible mappings for query vertices, often leading to useless computations. In this paper, we present an efficient and robust subgraph search solution, called TurboISO, which is turbo-charged with two novel concepts: candidate region exploration and the combine and permute strategy (in short, Comb/Perm). Candidate region exploration identifies, on the fly, candidate subgraphs (i.e., candidate regions), which contain embeddings, and computes a robust matching order for each candidate region explored. The Comb/Perm strategy exploits the novel concept of the neighborhood equivalence class (NEC). All query vertices in the same NEC have the same set of matching data vertices. During subgraph isomorphism search, Comb/Perm generates only combinations for each NEC instead of permuting all possible enumerations. Thus, if a chosen combination is determined not to contribute to a complete solution, all possible permutations of that combination can be safely pruned. Extensive experiments with many real datasets show that TurboISO consistently and significantly outperforms all competitors by up to several orders of magnitude.
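
The Comb/Perm idea in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: it assumes a simplified notion of NEC (query vertices with the same label and exactly the same neighbor set), and the function names, the star-shaped query, and the candidate list are all hypothetical. It only shows why enumerating combinations per NEC, and expanding into permutations afterwards, can save work.

```python
from collections import defaultdict
from itertools import combinations, permutations


def neighborhood_equivalence_classes(labels, adj):
    """Group query vertices that share a label and the same neighbor set.

    Simplifying assumption: two vertices are equivalent only if their
    labels and their open neighborhoods are identical.
    """
    groups = defaultdict(list)
    for v in sorted(labels):
        groups[(labels[v], frozenset(adj[v]))].append(v)
    return sorted(groups.values())


def expand(nec, combo):
    """Turn one accepted combination into every vertex-to-vertex mapping."""
    return [dict(zip(nec, p)) for p in permutations(combo)]


# Hypothetical star-shaped query: center 0 labeled "A", leaves 1..3 labeled "B".
labels = {0: "A", 1: "B", 2: "B", 3: "B"}
adj = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
necs = neighborhood_equivalence_classes(labels, adj)

# Hypothetical data vertices that could match any leaf of the query.
candidates = ["x", "y", "z", "w"]
leaf_nec = necs[1]

# Comb/Perm-style enumeration: one unordered choice per NEC.
combos = list(combinations(candidates, len(leaf_nec)))
# Naive enumeration: every ordered assignment of candidates to the leaves.
perms = list(permutations(candidates, len(leaf_nec)))

print(necs, len(combos), len(perms))
```

In this toy case the three leaves form one NEC, so the search considers 4 combinations rather than 24 ordered assignments. Rejecting one combination discards its 3! = 6 permutations at once, and `expand` generates those permutations only for combinations that survive.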