Path-tree: An efficient reachability indexing scheme for large directed graphs

Authors:
Ruoming Jin;Ning Ruan;Yang Xiang;Haixun Wang
Affiliations:
Kent State University;Kent State University;The Ohio State University;Microsoft Research Asia
Venue:
ACM Transactions on Database Systems (TODS)
Year:
2011

Abstract

Reachability queries are among the fundamental queries in graph databases. The main idea behind answering them is to assign labels to vertices such that the reachability between any two vertices can be determined from the labels alone. Though several approaches have been proposed for building these reachability labels, it remains an open issue how to handle the increasingly large number of vertices in real-world graphs, and how to find the best tradeoff among the labeling size, the query answering time, and the construction time. In this article, we introduce a novel graph structure, referred to as path-tree, to help label very large graphs. The path-tree cover is a spanning subgraph of the input graph G in a tree shape. We show that path-tree can be generalized to chain-tree, which theoretically can have a smaller labeling cost. On top of the path-tree and chain-tree indexes, we also introduce a new compression scheme that groups vertices with similar labels together to further reduce the labeling size. In addition, we propose an efficient incremental update algorithm for dynamic index maintenance. Finally, we demonstrate both analytically and empirically the effectiveness and efficiency of our new approaches.
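
To make the labeling idea concrete, the following is a minimal sketch of the simplest label-based reachability test: interval labeling on a rooted tree. It is not the path-tree or chain-tree scheme from the article, only an illustration of how a query can be answered from labels alone. The function names `interval_labels` and `reaches`, the adjacency-dictionary input, and the example tree are all assumptions made for this illustration.

```python
from typing import Dict, Hashable, List, Tuple

Label = Tuple[int, int]


def interval_labels(children: Dict[Hashable, List[Hashable]],
                    root: Hashable) -> Dict[Hashable, Label]:
    """Label every vertex of a rooted tree with a (start, end) interval.

    start is the vertex's preorder number; end is the largest preorder
    number in its subtree. Assumes `children` describes a tree
    (hypothetical input format: vertex -> list of child vertices).
    """
    labels: Dict[Hashable, Label] = {}
    start: Dict[Hashable, int] = {}
    counter = 0
    # Iterative DFS; the boolean marks whether the subtree is finished.
    stack = [(root, False)]
    while stack:
        v, finished = stack.pop()
        if finished:
            # All descendants have been numbered by now.
            labels[v] = (start[v], counter - 1)
            continue
        start[v] = counter
        counter += 1
        stack.append((v, True))
        for c in reversed(children.get(v, [])):
            stack.append((c, False))
    return labels


def reaches(labels: Dict[Hashable, Label], u: Hashable, v: Hashable) -> bool:
    """Return True if v lies in the subtree of u (u reaches v in the tree)."""
    u_start, u_end = labels[u]
    v_start, _ = labels[v]
    return u_start <= v_start <= u_end


# Example tree (illustrative only): a -> b, a -> c, b -> d
tree = {"a": ["b", "c"], "b": ["d"]}
labels = interval_labels(tree, "a")
print(reaches(labels, "a", "d"))
print(reaches(labels, "b", "c"))
```

On a tree, each query costs two integer comparisons and each vertex stores two integers. General directed graphs contain edges outside any spanning tree, which is why schemes such as the one described in this abstract need additional label information beyond a single interval.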