Time sharing of cluster resources is a major issue in cluster and Grid integration. The classical Grid architecture involves a higher-level scheduler that submits non-overlapping jobs to the independent batch schedulers of each cluster in the Grid. The sequential execution induced by this approach does not fit the expected number of users and the job heterogeneity of Grids. Time sharing techniques address this issue by allowing many applications to execute simultaneously on the same resources. Co-scheduling and gang scheduling are the two best-known techniques for time sharing cluster resources. Co-scheduling relies on the operating system of each node to schedule the processes of every application. Gang scheduling ensures that the same application is scheduled on all nodes simultaneously. Previous work has shown that co-scheduling techniques outperform gang scheduling when physical memory is not exhausted. In this paper, we introduce a new hybrid sharing technique that provides checkpoint-based explicit memory management. It consists of co-scheduling parallel applications within a set until the memory capacity of the node is reached, and using gang-scheduling-related techniques to switch from one set to another. We experimentally compare the merits of the three solutions (co-scheduling, gang scheduling, and hybrid scheduling) in the context of out-of-core computing, which is likely to occur in the Grid context, where many users share the same resources. The experiments show that the hybrid solution is as efficient as co-scheduling when physical memory is not exhausted, and more efficient than both gang scheduling and co-scheduling when physical memory is exhausted.
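
The hybrid idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the `App` record, the per-node memory figures, the next-fit grouping rule, and the round-robin rotation are all assumptions made for illustration, and the checkpoint-based switching between sets is not modelled. Applications are grouped into sets whose combined footprint fits in node memory; applications inside the active set would be co-scheduled by the node's operating system, and sets are rotated in a gang-like fashion.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class App:
    name: str
    mem_mb: int  # hypothetical per-node memory footprint of the application


def build_sets(apps: List[App], node_mem_mb: int) -> List[List[App]]:
    """Group applications into sets that each fit in node memory.

    Assumption: a simple next-fit rule in submission order; the source
    does not say how sets are formed.
    """
    sets: List[List[App]] = []
    current: List[App] = []
    used = 0
    for app in apps:
        if app.mem_mb > node_mem_mb:
            raise ValueError(f"{app.name} does not fit in node memory")
        if used + app.mem_mb > node_mem_mb:
            # Memory capacity reached: close this set and start a new one.
            sets.append(current)
            current, used = [], 0
        current.append(app)
        used += app.mem_mb
    if current:
        sets.append(current)
    return sets


def timeline(sets: List[List[App]], slices: int) -> List[List[str]]:
    """Return the names of the applications active in each time slice.

    Sets are rotated round-robin (gang-like switching); the applications
    within the active set are assumed to be co-scheduled by the node OS.
    """
    if not sets:
        return []
    return [[a.name for a in sets[i % len(sets)]] for i in range(slices)]


apps = [App("A", 600), App("B", 300), App("C", 500), App("D", 400)]
sets = build_sets(apps, node_mem_mb=1000)
plan = timeline(sets, slices=3)
```

With these hypothetical figures, A and B form one set and C and D another, and the two sets alternate. When every application fits in memory at once, the sketch produces a single set and reduces to plain co-scheduling, which matches the abstract's statement that the hybrid approach behaves like co-scheduling while physical memory is not exhausted.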