In the real world, a graph is often fragmented and distributed across different sites, which calls for techniques to evaluate queries on distributed graphs. This paper proposes distributed evaluation algorithms for three classes of queries: reachability, which determines whether one node can reach another; bounded reachability, which decides whether there exists a path of bounded length between a pair of nodes; and regular reachability, which checks whether there exists a path connecting two nodes such that the node labels on the path form a string in the language of a given regular expression. We develop these algorithms based on partial evaluation, to exploit parallel computation. When evaluating a query Q on a distributed graph G, we show that these algorithms possess the following performance guarantees, no matter how G is fragmented and distributed: (1) each site is visited only once; (2) the total network traffic is determined by the size of Q and the fragmentation of G, independent of the size of G; and (3) the response time is decided by the largest fragment of G rather than by the entire G. In addition, we show that these algorithms can be readily implemented in the MapReduce framework. Using synthetic and real-life data, we experimentally verify that these algorithms are scalable on large graphs, regardless of how the graphs are distributed.
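To make the idea of partial evaluation concrete, here is a minimal Python sketch of how plain reachability might be evaluated over a fragmented graph. It is an illustration under stated assumptions, not the paper's own pseudocode. The function names (`local_eval`, `assemble`, `dis_reach`), the dictionary layout of a fragment, and the toy three-site graph are all hypothetical. Each site is visited once and returns Boolean equations over its border nodes only. A coordinator then solves the combined equations, so the data shipped depends on the number of border nodes rather than on the size of the graph.

```python
from collections import deque

def local_eval(fragment, source, target):
    """Partial evaluation at one site (hypothetical helper).

    For every in-node of the fragment (and the source, if stored here),
    return a pair (reaches_target, virtual_nodes): whether the target is
    reached using local edges only, and which nodes owned by other sites
    are reached, standing in for still-unknown Boolean variables.
    """
    nodes, edges = fragment["nodes"], fragment["edges"]
    starts = set(fragment["in_nodes"])
    if source in nodes:
        starts.add(source)

    equations = {}
    for v in starts:
        seen, queue = {v}, deque([v])
        reaches_target = (v == target)
        virtual = set()
        while queue:
            u = queue.popleft()
            for w in edges.get(u, ()):
                if w in seen:
                    continue
                seen.add(w)
                if w == target:
                    reaches_target = True
                elif w in nodes:
                    queue.append(w)      # keep searching inside this fragment
                else:
                    virtual.add(w)       # node stored at another site
        equations[v] = (reaches_target, virtual)
    return equations

def assemble(partial_answers, source):
    """Coordinator: solve the combined Boolean equations by a graph search."""
    merged = {}
    for eqs in partial_answers:
        merged.update(eqs)
    seen, queue = {source}, deque([source])
    while queue:
        v = queue.popleft()
        reaches_target, deps = merged.get(v, (False, set()))
        if reaches_target:
            return True
        for d in deps:
            if d not in seen:
                seen.add(d)
                queue.append(d)
    return False

def dis_reach(fragments, source, target):
    """Visit each site once (sequentially here; in parallel in practice)."""
    partial_answers = [local_eval(f, source, target) for f in fragments]
    return assemble(partial_answers, source)

# Toy graph a -> b -> c -> d -> e -> t, split across three sites (assumed data).
FRAGMENTS = [
    {"nodes": {"a", "b"}, "edges": {"a": ["b"], "b": ["c"]}, "in_nodes": set()},
    {"nodes": {"c", "d"}, "edges": {"c": ["d"], "d": ["e"]}, "in_nodes": {"c"}},
    {"nodes": {"e", "t"}, "edges": {"e": ["t"]}, "in_nodes": {"e"}},
]

if __name__ == "__main__":
    print(dis_reach(FRAGMENTS, "a", "t"))
    print(dis_reach(FRAGMENTS, "t", "a"))
```

In this sketch the middle site reports only that `c` reaches the target if `e` does. It ships that single equation rather than any part of its local graph, which is the intuition behind traffic that is independent of the size of G.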