Graph indexing: tree + delta

Authors:
Peixiang Zhao;Jeffrey Xu Yu;Philip S. Yu
Affiliations:
The Chinese University of Hong Kong;The Chinese University of Hong Kong;IBM T. J. Watson Research Center
Venue:
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Year:
2007


Abstract

Recent scientific and technological advances have produced an abundance of structural patterns modeled as graphs. As a result, it is of special interest to process graph containment queries effectively on large graph databases. Given a graph database G and a query graph q, the graph containment query retrieves all graphs in G that contain q as a subgraph. Because G holds a vast number of graphs and subgraph isomorphism testing is inherently complex, it is desirable to use high-quality graph indexing mechanisms to reduce the overall query processing cost. In this paper, we propose a new cost-effective graph indexing method based on frequent tree-features of the graph database. We analyze the effectiveness and efficiency of trees as indexing features from three critical aspects: feature size, feature selection cost, and pruning power. To achieve better pruning ability than existing graph-based indexing methods, we select, in addition to frequent tree-features (Tree), a small number of discriminative graphs (Δ) on demand, without a costly graph mining process beforehand. Our study verifies that (Tree+Δ) is a better choice than graph features for indexing purposes, denoted (Tree+Δ ≥ Graph), for the graph containment query problem. This has two implications: (1) index construction with (Tree+Δ) is efficient, and (2) graph containment query processing with (Tree+Δ) is efficient. Our experimental studies demonstrate that (Tree+Δ) has a compact index structure, achieves an order of magnitude better performance in index construction, and, most importantly, outperforms recent graph-based indexing methods, gIndex and C-Tree, in graph containment query processing.
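
The graph containment query described in the abstract is commonly answered in two phases: an index filters the database down to candidate graphs, and each candidate is then verified by a subgraph isomorphism test. The Python sketch below illustrates only that general filter-and-verify idea and is not the paper's algorithm. It makes several simplifying assumptions: features are single labeled edges (the smallest possible trees) rather than mined frequent trees, no discriminative graphs (Δ) are selected, verification is brute force, and all names (`make_graph`, `edge_features`, `build_index`, `contains`, `containment_query`) are hypothetical.

```python
from itertools import permutations

def make_graph(labels, edges):
    """labels: vertex -> label; edges: iterable of vertex pairs (simple, undirected)."""
    return labels, {frozenset(e) for e in edges}

def edge_features(graph):
    """Toy stand-in for tree features: the sorted label pair of every edge."""
    labels, edges = graph
    return {tuple(sorted(labels[v] for v in e)) for e in edges}

def build_index(database):
    """Inverted index: feature -> ids of the graphs that contain it."""
    index = {}
    for gid, graph in database.items():
        for feature in edge_features(graph):
            index.setdefault(feature, set()).add(gid)
    return index

def contains(graph, query):
    """Brute-force subgraph isomorphism test; exponential, for tiny graphs only."""
    g_labels, g_edges = graph
    q_labels, q_edges = query
    q_vertices = list(q_labels)
    # Try every injective assignment of query vertices to graph vertices.
    for image in permutations(g_labels, len(q_vertices)):
        mapping = dict(zip(q_vertices, image))
        labels_ok = all(q_labels[v] == g_labels[mapping[v]] for v in q_vertices)
        edges_ok = all(frozenset(mapping[v] for v in e) in g_edges for e in q_edges)
        if labels_ok and edges_ok:
            return True
    return False

def containment_query(database, index, query):
    """Filter with the index, then verify each surviving candidate."""
    candidates = set(database)
    for feature in edge_features(query):
        candidates &= index.get(feature, set())
    answers = {gid for gid in candidates if contains(database[gid], query)}
    return candidates, answers

# Made-up example data: three small labeled graphs.
database = {
    "g1": make_graph({1: "C", 2: "C", 3: "O"}, [(1, 2), (2, 3), (1, 3)]),  # triangle
    "g2": make_graph({1: "C", 2: "C", 3: "O"}, [(1, 2), (2, 3)]),          # path
    "g3": make_graph({1: "C", 2: "O", 3: "N"}, [(1, 2), (2, 3)]),          # path
}
index = build_index(database)
query = make_graph({"a": "C", "b": "C", "c": "O"}, [("a", "b"), ("b", "c"), ("a", "c")])

candidates, answers = containment_query(database, index, query)
print(sorted(candidates))  # ['g1', 'g2']
print(sorted(answers))     # ['g1']
```

In this toy run the edge features prune `g3` but not `g2`, which shares every edge label pair with the triangle query yet has no cycle. That false positive is only eliminated by the verification step; it is the kind of case where a feature that is itself a graph rather than a tree could prune more, which is one way to read the motivation for adding a small number of discriminative graphs (Δ) to the tree features.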