MapReduce is an important programming model for building data centers containing tens of thousands of nodes. In a practical data center of that scale, it is common for I/O-bound jobs and CPU-bound jobs, which demand different resources, to run simultaneously in the same cluster. The MapReduce framework, however, does not consider parallelizing these two kinds of jobs against each other. In this paper, we give a new view of the MapReduce model and classify MapReduce workloads into three categories based on their CPU and I/O utilization. Building on this classification, we design a new dynamic MapReduce workload prediction mechanism, MR-Predict, which detects the workload type on the fly. We then propose a Triple-Queue Scheduler based on the MR-Predict mechanism. The Triple-Queue Scheduler improves the usage of both CPU and disk I/O resources under heterogeneous workloads, and improves Hadoop throughput by about 30% on such workloads.
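The abstract does not give the internals of MR-Predict or the Triple-Queue Scheduler, but the core idea (bin jobs by observed CPU and I/O utilization into three queues, then pick jobs that complement a node's spare resource) can be sketched roughly as below. The thresholds, queue names, and `TripleQueueScheduler` class are illustrative assumptions, not the paper's actual algorithm:

```python
from collections import deque

# Hypothetical sketch of triple-queue scheduling by workload type.
# Thresholds and structure are assumptions for illustration only.

CPU_THRESHOLD = 0.7  # fraction of time a job's tasks spend computing
IO_THRESHOLD = 0.7   # fraction of time spent waiting on disk I/O

def classify(cpu_util, io_util):
    """Bin a job into one of three workload categories on the fly."""
    if cpu_util >= CPU_THRESHOLD and io_util < IO_THRESHOLD:
        return "cpu_bound"
    if io_util >= IO_THRESHOLD and cpu_util < CPU_THRESHOLD:
        return "io_bound"
    return "mixed"

class TripleQueueScheduler:
    def __init__(self):
        self.queues = {"cpu_bound": deque(),
                       "io_bound": deque(),
                       "mixed": deque()}

    def submit(self, job, cpu_util, io_util):
        # Predicted utilization routes the job to one of the three queues.
        self.queues[classify(cpu_util, io_util)].append(job)

    def next_job(self, node_cpu_free, node_io_free):
        # Prefer jobs that consume the resource the node has most spare
        # capacity in, so CPU-bound and I/O-bound jobs overlap in the cluster.
        order = (["io_bound", "mixed", "cpu_bound"]
                 if node_io_free >= node_cpu_free
                 else ["cpu_bound", "mixed", "io_bound"])
        for q in order:
            if self.queues[q]:
                return self.queues[q].popleft()
        return None

sched = TripleQueueScheduler()
sched.submit("sort-job", cpu_util=0.3, io_util=0.9)  # classified I/O-bound
sched.submit("ml-train", cpu_util=0.9, io_util=0.2)  # classified CPU-bound
print(sched.next_job(node_cpu_free=0.8, node_io_free=0.2))  # ml-train
```

A node with mostly idle CPU thus receives the CPU-bound job first, while the I/O-bound job waits for a node with spare disk bandwidth, which is the resource-complementing behavior the abstract attributes to the Triple-Queue Scheduler.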