Modern production clusters are often shared by multiple types of jobs with different priorities in order to improve resource utilization. Preemption is a common technique employed by MapReduce schedulers to avoid delaying production jobs while still allowing non-production jobs to share the cluster; it also prevents a large job from occupying too many resources and starving others. Recent literature shows that jobs in production MapReduce clusters have a mixture of lengths and sizes spanning many orders of magnitude. In this type of environment, the preemption policy currently used by MapReduce schedulers can significantly delay the completion of long-running tasks, wasting the work they have already done. This paper first discusses the heterogeneous nature of MapReduce jobs and their arrival rates in several production clusters. Second, we characterize the situations in which the current preemption policy incurs a significant preemption penalty. We then propose a simple mechanism that works in conjunction with existing job schedulers to address this problem. Finally, we evaluate our solution under various types of workloads on Amazon EC2. Experiments show that our method improves normalized system performance by 15% during busy periods by effectively avoiding unnecessary preemption while preserving fairness.
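The intuition behind avoiding unnecessary preemption can be sketched in a few lines. The following is a minimal illustration, not the paper's actual mechanism: the `Task` fields, the `max_wait` threshold, and the victim-selection rule are all assumptions made for this sketch. The idea is that killing a nearly-finished task throws away almost all of its work, so when every candidate task will finish soon, it is cheaper to let the arriving production job wait briefly; otherwise, preempt the task that has accumulated the least work.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Task:
    task_id: str
    elapsed: float             # seconds of work already done (lost if preempted)
    remaining_estimate: float  # estimated seconds until completion

def pick_preemption_victim(tasks: List[Task], max_wait: float) -> Optional[Task]:
    """Choose a task to preempt for an arriving production job.

    Hypothetical policy for illustration: skip tasks that would finish
    within `max_wait` seconds anyway (waiting is cheaper than discarding
    nearly-complete work), and among the rest pick the task whose lost
    work (elapsed time) is smallest.
    """
    candidates = [t for t in tasks if t.remaining_estimate > max_wait]
    if not candidates:
        return None  # every task finishes soon; let the production job wait
    return min(candidates, key=lambda t: t.elapsed)
```

For example, with one task that has run 100s but needs only 5s more, and another that has run 10s with 500s remaining, the policy preempts the second task; if only the nearly-finished task is running, it preempts nothing and the production job waits.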