A bridging model for parallel computation
Communications of the ACM
Communicating sequential processes
Communications of the ACM
Parallel Multilevel Graph Partitioning
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
The webgraph framework I: compression techniques
Proceedings of the 13th international conference on World Wide Web
Pregel: a system for large-scale graph processing
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Fast PGAS Implementation of Distributed Graph Algorithms
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Proceedings of the 20th international conference on World wide web
Distributed GraphLab: a framework for machine learning and data mining in the cloud
Proceedings of the VLDB Endowment
Streaming graph partitioning for large distributed graphs
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
PowerGraph: distributed graph-parallel computation on natural graphs
OSDI'12 Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation
Mizan: a system for dynamic load balancing in large-scale graph processing
Proceedings of the 8th ACM European Conference on Computer Systems
Catch the Wind: Graph workload balancing on cloud
ICDE '13 Proceedings of the 2013 IEEE International Conference on Data Engineering (ICDE 2013)
Hi-index | 0.00 |
Graph partitioning is one of the key components in parallel graph computation, and the partition quality significantly affects the overall computing performance. In the existing graph computing systems, ``good'' partition schemes are preferred as they have smaller edge cut ratio and hence reduce the communication cost among working nodes. However, in an empirical study on Giraph[1], we found that the performance over well partitioned graph might be even two times worse than simple partitions. The cause is that the local message processing cost in graph computing systems may surpass the communication cost in several cases. In this paper, we analyse the cost of parallel graph computing systems as well as the relationship between the cost and underlying graph partitioning. Based on these observation, we propose a novel Partition Aware Graph computation Engine named PAGE. PAGE is equipped with two newly designed modules, i.e., the communication module with a dual concurrent message processor, and a partition aware one to monitor the system's status. The monitored information can be utilized to dynamically adjust the concurrency of dual concurrent message processor with a novel Dynamic Concurrency Control Model (DCCM). The DCCM applies several heuristic rules to determine the optimal concurrency for the message processor. We have implemented a prototype of PAGE and conducted extensive studies on a moderate size of cluster. The experimental results clearly demonstrate the PAGE's robustness under different graph partition qualities and show its advantages over existing systems with up to 59% improvement.