Research agencies and organizations develop algorithms and techniques to reduce operational and capital expenditure. Moving to the cloud transforms capital expenditure (CapEx) into operational expenditure (OpEx), and the cloud is increasingly used to crunch large amounts of commercial and social data. This paper proposes a heuristic approach to reducing the operational cost of virtual machines (VMs) running Hadoop. The heuristic is simple and effective: it scales the number of Hadoop nodes according to the type and size of the submitted job. We validate the heuristic with the Hadoop word-count example on data samples of different sizes. Our implementation is independent of the cloud provider, so the heuristic applies to both private and public clouds.
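The scaling rule the abstract describes can be sketched as a small sizing function. This is a hypothetical illustration, not the paper's actual heuristic: the job-type weights, the HDFS block size, and the slots-per-node figure are all assumed values chosen for the example.

```python
import math

# Illustrative sketch of a heuristic that picks the number of Hadoop
# worker VMs from the type and size of the submitted job.
# All constants below are assumptions for the sketch, not values
# taken from the paper.

HDFS_BLOCK_MB = 64          # classic default HDFS block size
MAP_SLOTS_PER_NODE = 2      # assumed map slots per worker VM

# Assumed relative weight per job type: a CPU-bound job such as
# word-count gains more from added parallelism than an I/O-bound one.
JOB_WEIGHT = {"cpu_bound": 1.0, "io_bound": 0.5}

def nodes_for_job(job_type: str, input_size_mb: int,
                  min_nodes: int = 1, max_nodes: int = 20) -> int:
    """Return a node count proportional to the number of input splits,
    scaled by the job type and clamped to the cluster limits."""
    splits = math.ceil(input_size_mb / HDFS_BLOCK_MB)
    wanted = math.ceil(JOB_WEIGHT[job_type] * splits / MAP_SLOTS_PER_NODE)
    return max(min_nodes, min(max_nodes, wanted))
```

For a 640 MB CPU-bound job this yields 10 splits and 5 nodes, while the cap of 20 nodes bounds the OpEx on very large inputs; a real deployment would calibrate the weights and limits against measured job profiles.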