A Load-Driven Task Scheduler with Adaptive DSC for MapReduce

  • Authors:
  • Hong Mao;Shengqiu Hu;Zhenzhong Zhang;Limin Xiao;Li Ruan

  • Venue:
  • GREENCOM '11 Proceedings of the 2011 IEEE/ACM International Conference on Green Computing and Communications
  • Year:
  • 2011

Abstract

With the rapid development of Internet applications, more and more network services and commercial applications are deployed to cloud computing environments, with petabytes of data to be processed. MapReduce is one of the best-known solutions for large-scale data processing. This paper focuses on optimizing the scheduler of the MapReduce framework at the task level. We take into account the hardware configuration and real-time workload of the nodes in a Hadoop cluster, aiming to shorten the running time of MapReduce jobs and to improve the hardware resource utilization rate. We propose a load-driven task scheduler that assigns tasks to Task Trackers according to the workload of the slave nodes. It is based on a Dynamic Slot Controller (DSC) that adaptively adjusts the Map task Slots (MS) and Reduce task Slots (RS) of the Task Trackers running on slave nodes. When processing 10 GB of data, our load-driven task scheduler shortens the running time of a MapReduce job by 14% and improves the CPU utilization rate of the Hadoop cluster by 34%.
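
The abstract does not give the scheduler's actual rules, so the following is only a minimal sketch of the general idea: a controller that grows or shrinks a node's map slots based on its load, and a scheduler that hands the next task to the least-loaded node with a free slot. The `Node` class, the CPU-load metric, the thresholds, and the slot limits are all hypothetical choices made for illustration and are not taken from the paper. Reduce slots could be handled the same way.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A slave node as seen by the scheduler (hypothetical model)."""
    name: str
    cpu_load: float        # fraction of CPU in use, 0.0 to 1.0 (assumed metric)
    map_slots: int         # current number of map task slots
    running_maps: int = 0  # map tasks currently running


def adjust_slots(node, low=0.5, high=0.9, min_slots=1, max_slots=8):
    """Assumed DSC step: add a slot on a lightly loaded node,
    remove one on a heavily loaded node. Thresholds are illustrative."""
    if node.cpu_load < low and node.map_slots < max_slots:
        node.map_slots += 1
    elif node.cpu_load > high and node.map_slots > min_slots:
        node.map_slots -= 1
    return node.map_slots


def pick_node(nodes):
    """Return the least-loaded node that has a free map slot, or None."""
    free = [n for n in nodes if n.running_maps < n.map_slots]
    return min(free, key=lambda n: n.cpu_load) if free else None
```

In this sketch, a node whose slots are all occupied is skipped even if its load is low, until the controller grants it another slot; a real scheduler would also need to account for data locality and for tasks that are already running when slots are reduced.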