Interference and locality-aware task scheduling for MapReduce applications in virtual clusters

  • Authors:
  • Xiangping Bu;Jia Rao;Cheng-zhong Xu

  • Affiliations:
  • Wayne State University, Detroit, MI, USA;University of Colorado at Colorado Springs, Colorado Springs, CO, USA;Wayne State University, Detroit, MI, USA

  • Venue:
  • Proceedings of the 22nd international symposium on High-performance parallel and distributed computing
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

MapReduce emerges as an important distributed programming paradigm for large-scale applications. Running MapReduce applications in clouds presents an attractive usage model for enterprises. In a virtual MapReduce cluster, the interference between virtual machines (VMs) causes performance degradation of map and reduce tasks and renders existing data locality-aware task scheduling policy, like delay scheduling, no longer effective. On the other hand, virtualization offers an extra opportunity of data locality for co-hosted VMs. In this paper, we present a task scheduling strategy to mitigate interference and meanwhile preserving task data locality for MapReduce applications. The strategy includes an interference-aware scheduling policy, based on a task performance prediction model, and an adaptive delay scheduling algorithm for data locality improvement. We implement the interference and locality-aware (ILA) scheduling strategy in a virtual MapReduce framework. We evaluated its effectiveness and efficiency on a 72-node Xen-based virtual cluster. Experimental results with 10 representative CPU and IO-intensive applications show that ILA is able to achieve a speedup of 1.5 to 6.5 times for individual jobs and yield an improvement of up to 1.9 times in system throughput in comparison with four other MapReduce schedulers.