Parallel jobs are characterized by processes that communicate and synchronize with each other frequently. A processor allocation strategy widely used in parallel supercomputers is space sharing, that is, assigning each job a partition of processors for its exclusive use. In this article we present a global solution that offers virtual malleability for message-passing parallel jobs by applying a processor allocation strategy called Folding by JobType (FJT). This technique is based on the concepts of folding and moldability. It tries to decide the optimal initial number of processes, when to fold jobs, and how many times to fold them by analyzing current and past system information. At the processor level, we apply co-scheduling. We implement and evaluate FJT under several workloads with different job sizes, classes, and machine utilization. Results show that FJT adapts easily to load changes and can outperform the other strategies evaluated on workloads with a high coefficient of variation, especially those with bursty arrivals.
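
To make the folding idea concrete, the following is a minimal sketch, not the FJT decision policy itself. It assumes that each fold halves the number of processors a job occupies by doubling the processes multiplexed on each processor; the abstract does not state this. The function names `processors_needed` and `min_folds_to_fit` and the `max_folds` limit are hypothetical. The real technique also uses job type and past system information, which this sketch ignores.

```python
from math import ceil
from typing import Optional


def processors_needed(num_processes: int, fold_times: int) -> int:
    """Processors occupied by a job after folding it `fold_times` times.

    Assumption: each fold doubles the processes sharing one processor,
    so the processor count is halved (rounded up) per fold.
    """
    if num_processes < 1 or fold_times < 0:
        raise ValueError("num_processes must be >= 1 and fold_times >= 0")
    return ceil(num_processes / (2 ** fold_times))


def min_folds_to_fit(
    num_processes: int, free_processors: int, max_folds: int = 2
) -> Optional[int]:
    """Smallest number of folds that lets the job fit in the free processors.

    Returns None if the job does not fit even after `max_folds` folds.
    `max_folds` is a hypothetical cap, not a value taken from the paper.
    """
    for folds in range(max_folds + 1):
        if processors_needed(num_processes, folds) <= free_processors:
            return folds
    return None


if __name__ == "__main__":
    # A 32-process job arriving when only 16 processors are free
    # would need one fold under the halving assumption.
    print(min_folds_to_fit(32, 16))
```

Under this assumption, folded processes share a processor, which is where a processor-level policy such as co-scheduling comes in.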