We present the design, implementation, and evaluation of Elastic Phoenix, a MapReduce implementation for shared-memory systems based on the original Phoenix from Stanford. The key new feature of Elastic Phoenix is support for malleable jobs: the ability to add and remove worker processes during the execution of a job. With the original Phoenix, the number of processors to be used is fixed at start-up time. With Elastic Phoenix, if more resources become available (as they might on an elastic cloud computing system), they can be added dynamically to a running job; if resources are reclaimed, they can be removed from it. The concept of malleable jobs is well known in job scheduling research, but implementations of malleable programming systems such as Elastic Phoenix are less common. We show that dynamically increasing the resources available to an Elastic Phoenix workload as it runs can reduce response time by 29% compared to a statically resourced workload. We detail the changes made to the Phoenix application programming interface (API) to support the new capability and discuss the implementation changes to the Phoenix code base. We also show that the additional run-time overheads introduced by Elastic Phoenix can be offset by the benefits of dynamically adding processor resources.
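The idea of a malleable job can be illustrated with a small sketch. This is not the Elastic Phoenix API or its implementation: the class `MalleablePool`, its methods, and the word-count job are hypothetical names chosen for illustration, and the sketch uses Python threads pulling from a shared task queue instead of the worker processes described above. It only shows the general pattern of growing and shrinking the set of workers while a map phase is in progress.

```python
import queue
import threading
from collections import Counter


class MalleablePool:
    """Toy worker pool whose size can change while a job runs (illustrative only)."""

    def __init__(self, map_fn):
        self.map_fn = map_fn
        self.tasks = queue.Queue()      # shared queue of pending map tasks
        self.results = []               # partial results from finished tasks
        self.lock = threading.Lock()
        self.workers = {}               # worker id -> (thread, stop event)
        self.next_id = 0

    def submit(self, chunks):
        for chunk in chunks:
            self.tasks.put(chunk)

    def _run(self, stop):
        # The stop flag is checked only between tasks, so a task that has
        # been taken from the queue is always completed before the worker exits.
        while not stop.is_set():
            try:
                chunk = self.tasks.get(timeout=0.05)
            except queue.Empty:
                continue
            out = self.map_fn(chunk)
            with self.lock:
                self.results.append(out)
            self.tasks.task_done()

    def add_worker(self):
        """Grow the job: start one more worker on the shared queue."""
        stop = threading.Event()
        thread = threading.Thread(target=self._run, args=(stop,), daemon=True)
        wid = self.next_id
        self.next_id += 1
        self.workers[wid] = (thread, stop)
        thread.start()
        return wid

    def remove_worker(self, wid):
        """Shrink the job: ask one worker to stop and wait for it to exit."""
        thread, stop = self.workers.pop(wid)
        stop.set()
        thread.join()

    def finish(self):
        """Wait for all tasks (requires at least one live worker), then stop."""
        self.tasks.join()
        for wid in list(self.workers):
            self.remove_worker(wid)
        return self.results


def word_count(chunk):
    return Counter(chunk.split())


def run_demo():
    pool = MalleablePool(word_count)
    pool.submit(["a b a", "b c", "a c c", "b b"] * 25)
    first = pool.add_worker()    # the job starts with a single worker
    pool.add_worker()            # more resources become available
    pool.add_worker()
    pool.remove_worker(first)    # a resource is reclaimed mid-job
    total = Counter()
    for partial in pool.finish():
        total += partial         # reduce step: merge partial counts
    return total
```

Calling `run_demo()` returns the same word counts regardless of when workers join or leave, because tasks are drawn from a shared queue and a departing worker finishes its current task first. A real system would also have to handle workers that disappear without warning, which this sketch does not attempt.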