Running MapReduce programs in the public cloud raises an important problem: how to optimize resource provisioning to minimize the financial charge for a specific job. In this paper, we study the whole process of MapReduce processing and build a cost function that explicitly models the relationship among the amount of input data, the available system resources (Map and Reduce slots), and the complexity of the Reduce function for the target MapReduce job. The model parameters can be learned from test runs on a small number of nodes. Based on this cost model, we can solve a number of decision problems, such as finding the amount of resources that minimizes the financial cost under a time deadline, or that minimizes the running time under a given financial budget. Experimental results show that this cost model performs well on the tested MapReduce programs.
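To illustrate the kind of decision problem described above, the following is a minimal sketch, not the cost function developed in the paper. It assumes a hypothetical model in which running time is a fixed overhead plus Map and Reduce work divided evenly across the slots, with coefficients `beta` standing in for parameters learned from test runs. The per-slot-second price, the names `job_time` and `cheapest_under_deadline`, and the exhaustive search over slot counts are likewise assumptions made for illustration.

```python
from itertools import product


def job_time(data_gb, map_slots, reduce_slots, beta):
    """Hypothetical running-time model (an assumption, not the paper's formula).

    beta = (b0, b1, b2): fixed overhead in seconds, Map seconds per GB,
    and Reduce seconds per GB, as if learned from small test runs.
    """
    b0, b1, b2 = beta
    return b0 + b1 * data_gb / map_slots + b2 * data_gb / reduce_slots


def cheapest_under_deadline(data_gb, deadline_s, price_per_slot_s, beta,
                            max_slots=64):
    """Search all (map_slots, reduce_slots) pairs up to max_slots and return
    the cheapest one whose predicted time meets the deadline.

    Returns (cost, map_slots, reduce_slots, time_s), or None if no
    configuration meets the deadline.
    """
    best = None
    for m, r in product(range(1, max_slots + 1), repeat=2):
        t = job_time(data_gb, m, r, beta)
        if t > deadline_s:
            continue  # this configuration misses the deadline
        # Assumed billing: every slot is paid for the whole job duration.
        cost = (m + r) * t * price_per_slot_s
        if best is None or cost < best[0]:
            best = (cost, m, r, t)
    return best


if __name__ == "__main__":
    beta = (10.0, 2.0, 1.0)  # made-up coefficients for the example
    print(cheapest_under_deadline(100, 60, 0.001, beta))
```

The mirrored problem, minimizing time under a budget, follows the same pattern with the roles of `cost` and `t` swapped in the filter and the comparison.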