Automatic task slots assignment in Hadoop MapReduce

  • Authors:
  • Kun Wang;Ben Tan;Juwei Shi;Bo Yang

  • Affiliations:
  • Peking University;IBM Research - China;IBM Research - China;IBM Research - China

  • Venue:
  • Proceedings of the 1st Workshop on Architectures and Systems for Big Data
  • Year:
  • 2011

Abstract

In this paper, we address the problem caused by the fixed assignment of task slots in Hadoop MapReduce. Manually configuring the optimal number of task slots is infeasible because workload characteristics vary. We design and implement an automatic control mechanism that dynamically assigns task slots based on the resource utilization of each Task Tracker node. The assignment takes the lag period into account. It can improve cluster-wide resource utilization and avoid contention. Experimental results show that our implementation dynamically adjusts the task slot capacity to the optimal setting at runtime. In some cases, such as Word Count, our control mechanism outperforms current Hadoop with the optimal task slot configuration found by manual tuning.
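
The abstract does not spell out the control algorithm, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes a simple threshold rule on a single utilization figure per node and models the lag period as a number of samples to wait after each change before adjusting again. The class name `SlotController`, the thresholds, and all parameter names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SlotController:
    """Hypothetical per-node controller; not the paper's actual algorithm."""
    slots: int = 2            # current task slot capacity on this node
    min_slots: int = 1
    max_slots: int = 16
    low_util: float = 0.60    # assumed threshold: below this the node is underused
    high_util: float = 0.90   # assumed threshold: above this the node is contended
    lag_ticks: int = 3        # samples to wait after a change (the lag period)
    _cooldown: int = 0        # samples remaining before the next change is allowed

    def update(self, utilization: float) -> int:
        """Feed one utilization sample in [0, 1]; return the slot capacity."""
        if self._cooldown > 0:
            # A recent change has not yet shown up in the measurements,
            # so hold the current capacity instead of reacting again.
            self._cooldown -= 1
            return self.slots
        if utilization < self.low_util and self.slots < self.max_slots:
            self.slots += 1                  # spare resources: add a slot
            self._cooldown = self.lag_ticks
        elif utilization > self.high_util and self.slots > self.min_slots:
            self.slots -= 1                  # contention: remove a slot
            self._cooldown = self.lag_ticks
        return self.slots
```

In this sketch, each node would call `update` once per monitoring interval with its latest utilization sample. Waiting out the lag period keeps the controller from reacting repeatedly to measurements that do not yet reflect its previous adjustment.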