Tarazu: optimizing MapReduce on heterogeneous clusters

Authors:
Faraz Ahmad;Srimat T. Chakradhar;Anand Raghunathan;T. N. Vijaykumar
Affiliations:
Purdue University, West Lafayette, IN, USA;NEC Laboratories America, Princeton, NJ, USA;Purdue University, West Lafayette, IN, USA;Purdue University, West Lafayette, IN, USA
Venue:
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Year:
2012

Abstract

Data center-scale clusters are evolving towards heterogeneous hardware for power, cost, differentiated price-performance, and other reasons. MapReduce is a well-known programming model for processing large amounts of data on data center-scale clusters. Most MapReduce implementations have been designed and optimized for homogeneous clusters. Unfortunately, these implementations perform poorly on heterogeneous clusters (e.g., on a 90-node cluster that contains 10 Xeon-based servers and 80 Atom-based servers, Hadoop performs worse than on 10-node Xeon-only or 80-node Atom-only homogeneous sub-clusters for many of our benchmarks). This poor performance remains despite previously proposed optimizations related to the management of straggler tasks. In this paper, we address MapReduce's poor performance on heterogeneous clusters. Our first contribution is to show that the poor performance is due to two key factors: (1) the non-intuitive effect that MapReduce's built-in load balancing results in excessive and bursty network communication during the Map phase, and (2) the intuitive effect that heterogeneity amplifies load imbalance in the Reduce computation. Our second contribution is Tarazu, a suite of optimizations to improve MapReduce performance on heterogeneous clusters. Tarazu consists of (1) Communication-Aware Load Balancing of Map computation (CALB) across the nodes, (2) Communication-Aware Scheduling of Map computation (CAS) to avoid bursty network traffic, and (3) Predictive Load Balancing of Reduce computation (PLB) across the nodes. Using the above 90-node cluster, we show that Tarazu significantly improves performance over a baseline of Hadoop with straightforward tuning for hardware heterogeneity.
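
The abstract does not describe how the Reduce computation is balanced, so the following is only an illustrative sketch of the general idea of skewing Reduce work towards faster nodes, not Tarazu's PLB algorithm. It assumes each node type has a known relative throughput; the function name `proportional_shares`, the node labels, and the 3:1 speed ratio are hypothetical and do not come from the paper.

```python
from typing import Dict


def proportional_shares(total_partitions: int, speeds: Dict[str, float]) -> Dict[str, int]:
    """Split reduce partitions across node types in proportion to relative speed.

    Uses largest-remainder rounding so the shares always sum to total_partitions.
    The speeds are assumed inputs (for example, measured or predicted throughput).
    """
    if total_partitions < 0:
        raise ValueError("total_partitions must be non-negative")
    if not speeds or any(s <= 0 for s in speeds.values()):
        raise ValueError("speeds must be a non-empty mapping of positive values")

    total_speed = sum(speeds.values())
    # Ideal (fractional) share of partitions for each node type.
    exact = {node: total_partitions * s / total_speed for node, s in speeds.items()}
    # Start from the rounded-down shares.
    shares = {node: int(x) for node, x in exact.items()}
    leftover = total_partitions - sum(shares.values())
    # Hand the remaining partitions to the largest fractional remainders.
    by_remainder = sorted(speeds, key=lambda n: exact[n] - shares[n], reverse=True)
    for node in by_remainder[:leftover]:
        shares[node] += 1
    return shares


if __name__ == "__main__":
    # Hypothetical ratio: a "xeon" node type assumed 3x as fast as an "atom" one.
    print(proportional_shares(8, {"xeon": 3.0, "atom": 1.0}))
```

A uniform split would give each node type the same number of partitions regardless of speed, so the slower type would finish last; weighting by throughput is one simple way to picture how that imbalance could be reduced.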