Mizan: a system for dynamic load balancing in large-scale graph processing

Authors and affiliations:
Zuhair Khayyat (King Abdullah University of Science and Technology, Saudi Arabia)
Karim Awara (King Abdullah University of Science and Technology, Saudi Arabia)
Amani Alonazi (King Abdullah University of Science and Technology, Saudi Arabia)
Hani Jamjoom (IBM T. J. Watson Research Center, Yorktown Heights, NY)
Dan Williams (IBM T. J. Watson Research Center, Yorktown Heights, NY)
Panos Kalnis (King Abdullah University of Science and Technology, Saudi Arabia)
Venue:
Proceedings of the 8th ACM European Conference on Computer Systems
Year:
2013

References (21):
A bridging model for parallel computation. Communications of the ACM.
Multilevel k-way partitioning scheme for irregular graphs. Journal of Parallel and Distributed Computing.
A Distributed Algorithm for Minimum-Weight Spanning Trees. ACM Transactions on Programming Languages and Systems (TOPLAS).
Parallel multilevel k-way partitioning scheme for irregular graphs. Supercomputing '96: Proceedings of the 1996 ACM/IEEE Conference on Supercomputing.
Looking up data in P2P systems. Communications of the ACM.
UbiCrawler: a scalable fully distributed web crawler. Software: Practice & Experience.
MapReduce: simplified data processing on large clusters. OSDI '04: Proceedings of the 6th Symposium on Operating Systems Design & Implementation, Volume 6.
PEGASUS: A Peta-Scale Graph Mining System Implementation and Observations. ICDM '09: Proceedings of the 2009 Ninth IEEE International Conference on Data Mining.
Kronecker Graphs: An Approach to Modeling Networks. The Journal of Machine Learning Research.
Pregel: a system for large-scale graph processing. Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data.
X-RIME: Cloud-Based Large Scale Social Network Analysis. SCC '10: Proceedings of the 2010 IEEE International Conference on Services Computing.
HAMA: An Efficient Matrix Computation with the MapReduce Framework. CLOUDCOM '10: Proceedings of the 2010 IEEE Second International Conference on Cloud Computing Technology and Science.
HipG: parallel processing of large-scale graphs. ACM SIGOPS Operating Systems Review.
Designing a common communication subsystem. PVM/MPI '05: Proceedings of the 12th European PVM/MPI Users' Group Conference on Recent Advances in Parallel Virtual Machine and Message Passing Interface.
Kineograph: taking the pulse of a fast-changing and connected world. Proceedings of the 7th ACM European Conference on Computer Systems.
Distributed GraphLab: a framework for machine learning and data mining in the cloud. Proceedings of the VLDB Endowment.
Distributed Graph Database for Large-Scale Social Computing. CLOUD '12: Proceedings of the 2012 IEEE Fifth International Conference on Cloud Computing.
The little engine(s) that could: scaling online social networks. IEEE/ACM Transactions on Networking (TON).
PowerGraph: distributed graph-parallel computation on natural graphs. OSDI '12: Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation.
GraphChi: large-scale graph computation on just a PC. OSDI '12: Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation.
Improving large graph processing on partitioned graphs in the cloud. Proceedings of the Third ACM Symposium on Cloud Computing.

Cited by (3):
To 4,000 compute nodes and beyond: network-aware vertex placement in large-scale graph processing systems. Proceedings of the ACM SIGCOMM 2013 Conference.
PAGE: a partition aware graph computation engine. Proceedings of the 22nd ACM International Conference on Information & Knowledge Management.
PREDIcT: towards predicting the runtime of large scale iterative analytics. Proceedings of the VLDB Endowment.

Abstract

Pregel [23] was recently introduced as a scalable graph mining system that can provide significant performance improvements over traditional MapReduce implementations. Existing implementations focus primarily on graph partitioning as a preprocessing step to balance computation across compute nodes. In this paper, we examine the runtime characteristics of a Pregel system. We show that graph partitioning alone is insufficient for minimizing end-to-end computation. An adaptive approach is needed, especially where the data is very large or the runtime behavior of the algorithm is unknown. To this end, we introduce Mizan, a Pregel system that achieves efficient load balancing to better adapt to changes in computing needs. Unlike known implementations of Pregel, Mizan does not assume any a priori knowledge of the structure of the graph or the behavior of the algorithm. Instead, it monitors the runtime characteristics of the system and then performs efficient fine-grained vertex migration to balance computation and communication. We have fully implemented Mizan; using extensive evaluation, we show that, especially for highly dynamic workloads, Mizan provides up to 84% improvement over techniques leveraging static graph pre-partitioning.
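
The monitor-then-migrate idea in the abstract can be illustrated with a minimal sketch. This is not Mizan's actual algorithm, which the abstract does not specify. The function name `plan_migrations`, the single per-vertex cost figure, the standard-deviation threshold, and the greedy choice of vertices are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def plan_migrations(vertex_cost, placement, threshold=1.0):
    """Return a list of (vertex, src_worker, dst_worker) moves.

    vertex_cost: dict vertex -> cost observed in the last superstep
                 (hypothetical metric, e.g. compute time or messages)
    placement:   dict vertex -> worker currently hosting the vertex
    threshold:   assumed tolerance, in standard deviations above the mean
    """
    # Aggregate per-worker load from the per-vertex measurements.
    load = {w: 0.0 for w in set(placement.values())}
    for v, w in placement.items():
        load[w] += vertex_cost.get(v, 0.0)

    avg, sd = mean(load.values()), pstdev(load.values())
    src = max(load, key=load.get)  # busiest worker
    dst = min(load, key=load.get)  # least busy worker

    # No plan when the busiest worker is within tolerance of the mean.
    if sd == 0 or (load[src] - avg) / sd <= threshold:
        return []

    moves = []
    # Consider the costliest vertices on the busiest worker first.
    candidates = sorted(
        (v for v, w in placement.items() if w == src),
        key=lambda v: vertex_cost.get(v, 0.0),
        reverse=True,
    )
    for v in candidates:
        c = vertex_cost.get(v, 0.0)
        # Move only if it narrows the gap without flipping the imbalance.
        if c > 0 and load[src] - c >= load[dst] + c:
            moves.append((v, src, dst))
            load[src] -= c
            load[dst] += c
    return moves


if __name__ == "__main__":
    costs = {"a1": 5, "a2": 4, "a3": 1, "b1": 2, "c1": 3}
    where = {"a1": "A", "a2": "A", "a3": "A", "b1": "B", "c1": "C"}
    print(plan_migrations(costs, where))
```

The sketch plans moves from fine-grained, per-vertex statistics gathered at runtime, so it needs no prior knowledge of the graph structure or the algorithm. A real system would also have to account for the cost of the migration itself and for communication between vertices, which this example ignores.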