Workload consolidation, sharing physical resources among multiple workloads, is a promising technique for saving cost and energy in cluster computing systems. This paper highlights a number of challenges associated with workload consolidation for Hadoop, one of the current state-of-the-art data-intensive cluster computing systems. Through a systematic, step-by-step procedure, we investigate the challenges of efficient server consolidation in Hadoop environments. To this end, we first investigate the relationship between last-level cache (LLC) contention and throughput degradation for workloads consolidated on a single physical server running the Hadoop Distributed File System (HDFS). We then investigate the general case of consolidation on multiple physical servers, such that their throughput never falls below a desired, predefined utilization level. We use our empirical results to model consolidation as a classic two-dimensional bin-packing problem and design a computationally efficient greedy algorithm that minimizes throughput degradation across multiple servers. Results are promising: our greedy approach achieves near-optimal solutions in all experimented cases.
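To make the two-dimensional bin-packing formulation concrete, the sketch below shows a standard first-fit-decreasing greedy heuristic over two resource dimensions. This is only an illustration of the general technique, not the paper's actual algorithm: the function name, the choice of first-fit-decreasing, and the normalized (CPU, LLC) demand vectors are all assumptions introduced here for clarity.

```python
from typing import List, Tuple

def greedy_two_dim_pack(
    workloads: List[Tuple[float, float]],
    capacity: Tuple[float, float] = (1.0, 1.0),
) -> List[List[int]]:
    """First-fit-decreasing greedy packing in two resource dimensions.

    Each workload is a hypothetical (cpu_demand, llc_demand) pair,
    normalized to the server capacity. Returns a list of servers,
    each given as the list of workload indices assigned to it.
    """
    # Place "bulky" workloads first: sort indices by total normalized
    # demand, largest first, while servers are still mostly empty.
    order = sorted(range(len(workloads)),
                   key=lambda i: sum(workloads[i]), reverse=True)
    servers: List[List[int]] = []   # workload indices per server
    loads: List[List[float]] = []   # current (cpu, llc) load per server
    for i in order:
        d = workloads[i]
        for s, load in enumerate(loads):
            # First fit: use the first server with room in BOTH dimensions,
            # so neither CPU nor LLC capacity is oversubscribed.
            if (load[0] + d[0] <= capacity[0]
                    and load[1] + d[1] <= capacity[1]):
                servers[s].append(i)
                load[0] += d[0]
                load[1] += d[1]
                break
        else:
            # No existing server fits: open a new one.
            servers.append([i])
            loads.append([d[0], d[1]])
    return servers

# Example: four workloads packed onto two servers.
jobs = [(0.5, 0.3), (0.4, 0.6), (0.3, 0.3), (0.6, 0.2)]
print(greedy_two_dim_pack(jobs))  # → [[1, 0], [3, 2]]
```

First-fit-decreasing is a common choice here because it is near-linear in practice (after the sort) while staying close to the optimal server count, which matches the paper's goal of a computationally efficient greedy solution.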