Bisimulation reduction of big graphs on MapReduce

Authors:
Yongming Luo;Yannick de Lange;George H. L. Fletcher;Paul De Bra;Jan Hidders;Yuqing Wu
Affiliations:
Eindhoven University of Technology, The Netherlands;Eindhoven University of Technology, The Netherlands;Eindhoven University of Technology, The Netherlands;Eindhoven University of Technology, The Netherlands;Delft University of Technology, The Netherlands;Indiana University, Bloomington
Venue:
BNCOD'13 Proceedings of the 29th British National conference on Big Data
Year:
2013

Abstract

Computing the bisimulation partition of a graph is a fundamental problem that plays a key role in a wide range of basic applications. Intuitively, two nodes in a graph are bisimilar if they share basic structural properties such as labeling and neighborhood topology. In data management, reducing a graph under bisimulation equivalence is a crucial step, e.g., for indexing the graph for efficient query processing. Real-world graphs of interest are often massive; examples include social networks and linked open data. For analytics on such graphs, it is becoming increasingly infeasible to rely on in-memory or even I/O-efficient solutions. Hence, a trend in Big Data analytics is the use of distributed computing frameworks such as MapReduce. While there are both internal and external memory solutions for efficiently computing bisimulation, there is, to our knowledge, no effective MapReduce-based solution. Motivated by these observations, we propose the first efficient MapReduce-based algorithm for computing the bisimulation partition of massive graphs. We also detail several optimizations for handling the data skew that often arises in real-world graphs. We present the results of an extensive empirical study that demonstrates the effectiveness and scalability of our solution.
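
To make the notion of a bisimulation partition concrete, here is a minimal single-machine sketch of signature-based partition refinement, written so that each round resembles one map step and one reduce step. It is an illustration only and is not the algorithm proposed in the paper: the function names, the input layout (a label per node plus labeled edges), and the choice to compare nodes by their outgoing edges are all assumptions made for this example.

```python
from collections import defaultdict


def _renumber(key_by_node):
    """Assign small integer block ids to distinct keys, in first-seen order."""
    ids = {}
    return {node: ids.setdefault(key, len(ids)) for node, key in key_by_node.items()}


def bisimulation_partition(labels, edges, max_rounds=None):
    """Return a dict mapping each node to a block id (hypothetical helper).

    labels: dict node -> node label
    edges:  iterable of (source, edge_label, target) triples
    Two nodes share a block id if they were not distinguished by their
    label or by the (edge label, target block) pairs of their outgoing
    edges within the rounds that were run.
    """
    edges = list(edges)
    # Round 0: group nodes by their own label only.
    block = _renumber(dict(labels))
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        # "Map" step: every edge emits (edge label, block of target),
        # keyed by its source node.
        emitted = defaultdict(set)
        for src, elabel, dst in edges:
            emitted[src].add((elabel, block[dst]))
        # "Reduce" step: per node, combine the previous block id with
        # the set of emitted pairs into a signature.
        signature = {
            node: (block[node], frozenset(emitted.get(node, ())))
            for node in labels
        }
        new_block = _renumber(signature)
        rounds += 1
        # The signature contains the old block id, so blocks can only
        # split. An unchanged block count means a fixpoint was reached.
        if len(set(new_block.values())) == len(set(block.values())):
            break
        block = new_block
    return block


if __name__ == "__main__":
    labels = {"a": "P", "b": "P", "c": "Q", "d": "Q", "e": "P"}
    edges = [("a", "knows", "c"), ("b", "knows", "d")]
    print(bisimulation_partition(labels, edges))
```

In this toy graph, `a` and `b` end up in the same block because both carry label `P` and have a `knows` edge into the block of `Q` nodes, while `e` is split off because it has no outgoing edge. Passing `max_rounds=k` stops after `k` refinement rounds, which only looks at neighborhoods up to distance `k`. A distributed version would additionally have to deal with shuffling block ids between rounds and with skewed node degrees, which this sketch does not attempt.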