High-performance parallel and scientific applications are composed of multiple processes that run on distinct CPUs and communicate frequently. Because of the synchronization needs of such applications, performance suffers greatly if their processes are not scheduled simultaneously on the CPUs. Implicit coscheduling (ICS) is a well-known technique for addressing this problem in multiprogrammed clusters; however, traditional ICS schemes do not include steps to deal adequately with priority boost conflicts, which leads to significantly degraded performance. In this paper, we propose using run-time differences in contention across nodes to make more sophisticated coscheduling decisions in response to these conflicts. We also present a novel coscheduling scheme termed PROC (Process ReOrdering-based Coscheduling) that adaptively regulates the scheduling sequence of conflicting processes based on the rescheduling latency of their correspondents in remote nodes. We perform extensive simulation-based experiments using both synthetic and realistic workloads to analyze the performance of PROC compared to alternatives such as local scheduling, a widely used batch scheduling scheme, gang scheduling, and existing ICS schemes. The results show that priority boost conflicts are common to all ICS schemes, and that the proposed PROC significantly outperforms other ICS alternatives by up to 50.4%, and batch scheduling by up to 72.5%, in average job response time. This improvement is achieved by reducing wasted idle time and spinning time without sacrificing fairness.
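As a rough illustration of the reordering idea, the following minimal sketch orders a set of conflicting processes on one node by the estimated rescheduling latency of their remote correspondents. It is not the PROC algorithm itself: the abstract does not specify the ordering rule, so the ascending order used here, the `BoostedProcess` type, the `reorder_conflicting` function, and the latency values are all hypothetical assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BoostedProcess:
    """A local process whose priority was boosted (hypothetical type)."""
    pid: int
    # Assumed estimate of how long the remote correspondent will wait
    # before it is rescheduled, in milliseconds (illustrative values only).
    remote_resched_latency_ms: float


def reorder_conflicting(processes):
    """Return the conflicting processes in a new scheduling order.

    Assumption: processes whose remote correspondents are expected to be
    rescheduled soonest run first; ties are broken by pid so the order
    is deterministic.
    """
    return sorted(
        processes,
        key=lambda p: (p.remote_resched_latency_ms, p.pid),
    )


# Three processes boosted at the same time on one node (a conflict).
conflict = [
    BoostedProcess(pid=1, remote_resched_latency_ms=8.0),
    BoostedProcess(pid=2, remote_resched_latency_ms=1.5),
    BoostedProcess(pid=3, remote_resched_latency_ms=4.0),
]
order = [p.pid for p in reorder_conflicting(conflict)]
print(order)  # [2, 3, 1]
```

In this sketch, a process whose correspondent is about to run is served first, since a short remote wait suggests the pair can synchronize soon; the opposite ordering, or a more elaborate policy, would fit the same interface.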