A load-aware scheduler for MapReduce framework in heterogeneous cloud environments

Authors:
Hsin-Han You;Chun-Chung Yang;Jiun-Long Huang
Affiliations:
National Chiao Tung University, Hsinchu, Taiwan, ROC;National Chiao Tung University, Hsinchu, Taiwan, ROC;National Chiao Tung University, Hsinchu, Taiwan, ROC
Venue:
Proceedings of the 2011 ACM Symposium on Applied Computing
Year:
2011

Abstract

MapReduce is becoming a popular programming model for large-scale data processing in cloud computing environments. Hadoop MapReduce is the most popular open-source implementation of the MapReduce framework. It comes with a pluggable task scheduler interface as well as a default FIFO job scheduler. The default Hadoop scheduler assumes a homogeneous environment and thus does not perform well in heterogeneous environments. Although the LATE scheduler was proposed to schedule tasks and jobs in heterogeneous environments, it does not consider dynamic loading, that is, node load that changes while tasks are running, which is common in practice. In view of this, we propose a new scheduler named the Load-Aware scheduler, abbreviated as the LA scheduler, which addresses the problems caused by dynamic loading and thereby improves the overall performance of Hadoop clusters. Experimental results show that the LA scheduler reduces average response time by up to 20% by avoiding unnecessary speculative tasks.
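
The abstract does not describe the LA scheduler's algorithm, so the following is only an illustrative sketch of why dynamic load can trigger unnecessary speculative tasks. It is not the authors' method. The first function uses a LATE-style estimate of remaining time (remaining progress divided by observed progress rate). The second applies a hypothetical load correction. The names `TaskSample`, `node_load`, `expected_load`, and `load_adjusted_time_left` are assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class TaskSample:
    progress: float   # fraction of the task completed, in [0, 1]
    elapsed_s: float  # seconds since the task started
    node_load: float  # hypothetical: share of the node used by other work, in [0, 1)


def late_time_left(t: TaskSample) -> float:
    """LATE-style estimate: remaining progress divided by observed progress rate."""
    if t.progress <= 0.0:
        return float("inf")
    rate = t.progress / t.elapsed_s
    return (1.0 - t.progress) / rate


def load_adjusted_time_left(t: TaskSample, expected_load: float = 0.0) -> float:
    """Hypothetical correction (an assumption, not the paper's formula):
    rescale the observed rate to the capacity the node is expected to
    offer once its background load changes to expected_load."""
    if t.progress <= 0.0:
        return float("inf")
    rate = t.progress / t.elapsed_s
    adjusted_rate = rate * (1.0 - expected_load) / (1.0 - t.node_load)
    return (1.0 - t.progress) / adjusted_rate


# A task that looks slow only because its node was temporarily busy.
sample = TaskSample(progress=0.2, elapsed_s=40.0, node_load=0.75)
naive = late_time_left(sample)              # about 160 seconds
adjusted = load_adjusted_time_left(sample)  # about 40 seconds if the load clears
```

With a speculation cutoff between the two estimates, the unadjusted estimate would mark this task as a straggler and launch a backup copy, while the load-adjusted estimate would not. This is the kind of unnecessary speculative task the abstract refers to.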