The volume of information on the Internet is growing explosively, and it is usually distributed across different sites, which makes finding information efficiently difficult. Efficient query evaluation on distributed graphs is therefore an important research topic, with real applications such as social network analysis, web mining, and ontology matching. A widely used query on distributed graphs is the regular reachability query (RRQ). An RRQ verifies whether one node can reach another by a path satisfying a regular expression. Traditionally, RRQs are evaluated by distributed depth-first search or distributed breadth-first search. However, on large graphs these methods are limited by total network traffic and response time. Recently, Wenfei Fan et al. proposed an approach that improves reachability queries by visiting each site only once, but it suffers from a communication bottleneck when assembling the distributed partial query results. In this paper, we propose two algorithms that improve Wenfei Fan's algorithm for RRQs. The first algorithm filters and removes redundant nodes and edges on each local site, in parallel. The second algorithm limits data transfers by locally contracting the partial result. We extensively evaluated our algorithms on MapReduce using YouTube and DBLP datasets. The experimental results show that our method reduces unnecessary data transfers by up to 60%, which addresses the communication bottleneck problem.
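
To make the notion of a regular reachability query concrete, the following is a minimal single-machine sketch, not the distributed algorithms proposed in the paper. It assumes an edge-labeled graph and a regular expression that has already been compiled by hand into a deterministic finite automaton; the function name `regular_reachable`, the toy graph, and the labels `friend` and `coauthor` are all hypothetical and chosen only for illustration. The query is answered by a breadth-first search over pairs of (graph node, automaton state).

```python
from collections import deque


def regular_reachable(edges, dfa, start_state, accepting, source, target):
    """Return True if some path from source to target spells a word
    accepted by the DFA (assumption: labels live on edges)."""
    # Build adjacency lists: node -> list of (label, next_node).
    adj = {}
    for u, label, v in edges:
        adj.setdefault(u, []).append((label, v))

    # Breadth-first search over (node, automaton state) pairs.
    start = (source, start_state)
    seen = {start}
    queue = deque([start])
    while queue:
        node, state = queue.popleft()
        if node == target and state in accepting:
            return True
        for label, nxt in adj.get(node, []):
            nxt_state = dfa.get((state, label))
            if nxt_state is None:
                continue  # the automaton has no transition on this label
            pair = (nxt, nxt_state)
            if pair not in seen:
                seen.add(pair)
                queue.append(pair)
    return False


# Hypothetical toy graph: triples of (source node, edge label, target node).
edges = [
    ("a", "friend", "b"),
    ("b", "friend", "c"),
    ("c", "coauthor", "d"),
    ("a", "coauthor", "e"),
]

# Hand-built DFA for the regular expression friend* coauthor.
dfa = {(0, "friend"): 0, (0, "coauthor"): 1}
accepting = {1}

print(regular_reachable(edges, dfa, 0, accepting, "a", "d"))
print(regular_reachable(edges, dfa, 0, accepting, "a", "c"))
```

In a distributed setting the graph is split across sites, so a search like this would have to cross site boundaries repeatedly; that is the traffic and response-time cost the abstract attributes to distributed depth-first and breadth-first search.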