In this paper, we address the problem caused by the fixed assignment of task slots in Hadoop MapReduce. Manually configuring the optimal number of task slots is infeasible because workloads differ in their characteristics. We design and implement an automatic control mechanism that dynamically assigns task slots based on the resource utilization of each TaskTracker node. The assignment takes the lag period into account. The mechanism can improve cluster-wide resource utilization and avoid contention. Experimental results show that our implementation can dynamically adjust the task slot capacity to the optimal setting at runtime. In some cases, such as Word Count, our control mechanism outperforms the current Hadoop with the optimal task slot configuration found by manual tuning.
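The abstract does not spell out the control law, so the following is only a minimal sketch of the general idea, not the authors' mechanism. It assumes a single utilization signal per node, a simple threshold rule, and a lag period modeled as a number of samples to skip after each change so that the effect of the change can show up in the measurements. The class name `SlotController`, its fields, and all threshold values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SlotController:
    """Hypothetical per-node controller; names and thresholds are illustrative."""

    slots: int = 2
    min_slots: int = 1
    max_slots: int = 8
    low_util: float = 0.60   # below this, the node is treated as underused
    high_util: float = 0.90  # above this, the node is treated as contended
    lag_ticks: int = 3       # samples to skip after a change (assumed lag model)
    _cooldown: int = 0

    def update(self, utilization: float) -> int:
        """Consume one utilization sample in [0, 1] and return the slot count."""
        if self._cooldown > 0:
            # A recent change has not yet shown up in the measurements.
            self._cooldown -= 1
            return self.slots
        if utilization < self.low_util and self.slots < self.max_slots:
            self.slots += 1
            self._cooldown = self.lag_ticks
        elif utilization > self.high_util and self.slots > self.min_slots:
            self.slots -= 1
            self._cooldown = self.lag_ticks
        return self.slots


if __name__ == "__main__":
    controller = SlotController()
    # Made-up utilization samples, one per monitoring interval.
    samples = [0.30, 0.35, 0.40, 0.45, 0.50, 0.70, 0.95, 0.97, 0.96, 0.95]
    for tick, util in enumerate(samples):
        print(tick, util, controller.update(util))
```

Skipping samples during the lag period keeps the controller from reacting to utilization readings that still reflect the previous slot count, which would otherwise cause it to overshoot and oscillate.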