iGraph: a framework for comparisons of disk-based graph indexing techniques

Authors:
Wook-Shin Han;Jinsoo Lee;Minh-Duc Pham;Jeffrey Xu Yu
Affiliations:
Kyungpook National University, Korea;Kyungpook National University, Korea;Kyungpook National University, Korea;Chinese University of Hong Kong, Hong Kong
Venue:
Proceedings of the VLDB Endowment
Year:
2010

Abstract

Graphs are of growing importance in modeling complex structures such as chemical compounds, proteins, images, and program dependence. Given a query graph Q, the subgraph isomorphism problem is to find all graphs in a graph database that contain Q; the problem is NP-complete. Recently, considerable research effort has gone into solving the subgraph isomorphism problem for large graph databases by using graph indexes. Using a graph index as a filter, we prune graphs that cannot be answers at low cost. We then need to run expensive subgraph isomorphism tests only on the candidates that survive filtering. In this way, the number of disk I/Os and subgraph isomorphism tests can be significantly reduced. The current practice for experiments on graph indexing techniques is that the authors of a newly proposed technique do not implement existing indexes on their own code base, but instead use the original authors' binary executables and report only the wall clock time. However, we observe that this practice may result in several problems. To address these problems, we have made significant efforts to implement all representative indexing methods on a common framework called iGraph. Unlike existing implementations, which either use (full or partial) in-memory representations or rely on the OS file system cache without guaranteeing real disk I/Os, we have implemented these indexes on top of a storage engine that guarantees real disk I/Os. Through extensive experiments using many synthetic and real datasets, we also provide new empirical findings on the performance of the fully disk-based implementations of these methods.
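The filter-then-verify scheme described in the abstract can be illustrated with a minimal in-memory sketch. Everything in it is an assumption made for illustration: the `Graph` class, the function names, and the toy database are hypothetical. The "index" is just a count of label pairs on edges, which is far simpler than any of the indexing methods implemented in iGraph, and the sketch does no disk I/O at all. The verification step is a plain backtracking search for a non-induced subgraph match.

```python
from collections import Counter


class Graph:
    """Undirected vertex-labeled graph (hypothetical helper for this sketch)."""

    def __init__(self, labels, edges):
        self.labels = dict(labels)               # vertex id (int) -> label
        self.adj = {v: set() for v in self.labels}
        for u, v in edges:
            self.adj[u].add(v)
            self.adj[v].add(u)

    def edge_features(self):
        """Count edges by their unordered pair of endpoint labels."""
        feats = Counter()
        for u in self.adj:
            for v in self.adj[u]:
                if u < v:                        # count each edge once
                    pair = tuple(sorted((self.labels[u], self.labels[v])))
                    feats[pair] += 1
        return feats


def contains(g, q):
    """Verification: backtracking test for whether g contains q as a subgraph."""
    order = list(q.labels)
    mapping, used = {}, set()

    def extend(i):
        if i == len(order):
            return True
        u = order[i]
        for v in g.labels:
            if v in used or g.labels[v] != q.labels[u]:
                continue
            # Every already-mapped neighbor of u must map to a neighbor of v.
            if all(mapping[w] in g.adj[v] for w in q.adj[u] if w in mapping):
                mapping[u] = v
                used.add(v)
                if extend(i + 1):
                    return True
                del mapping[u]
                used.discard(v)
        return False

    return extend(0)


def filter_and_verify(database, index, q):
    """Filter with the feature index, then verify only the candidates."""
    qf = q.edge_features()
    # A graph can contain q only if it has at least as many edges of each kind.
    candidates = [gid for gid, gf in index.items()
                  if all(gf[f] >= c for f, c in qf.items())]
    answers = [gid for gid in candidates if contains(database[gid], q)]
    return candidates, answers


# Toy database (made-up data, not from the paper).
database = {
    0: Graph({0: "C", 1: "C", 2: "O"}, [(0, 1), (1, 2), (0, 2)]),  # triangle
    1: Graph({0: "C", 1: "O", 2: "C"}, [(0, 1), (1, 2)]),          # C-O-C path
    2: Graph({0: "N", 1: "N"}, [(0, 1)]),                          # N-N edge
    3: Graph({0: "C", 1: "C", 2: "C", 3: "O"}, [(0, 1), (2, 3)]),  # C-C and C-O apart
}
index = {gid: g.edge_features() for gid, g in database.items()}

query = Graph({0: "C", 1: "C", 2: "O"}, [(0, 1), (1, 2)])          # C-C-O path
candidates, answers = filter_and_verify(database, index, query)
print("candidates:", candidates, "answers:", answers)
# candidates: [0, 3] answers: [0]
```

In this toy run the filter discards graphs 1 and 2 without any isomorphism test, graph 3 passes the filter but fails verification (a false positive of the filter), and only graph 0 is a real answer. The filter is safe because an injective vertex mapping sends distinct query edges to distinct data-graph edges with the same label pair, so a containing graph can never be pruned.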