Modeling and analysis of dynamic coscheduling in parallel and distributed environments

Authors:
Mark S. Squillante;Yanyong Zhang;Anand Sivasubramaniam;Natarajan Gautam;Hubertus Franke;Jose Moreira
Affiliations:
Thomas J. Watson Research Center, Yorktown Heights, NY;Pennsylvania State University, University Park, PA;Pennsylvania State University, University Park, PA;Pennsylvania State University, University Park, PA;Thomas J. Watson Research Center, Yorktown Heights, NY;Thomas J. Watson Research Center, Yorktown Heights, NY
Venue:
SIGMETRICS '02 Proceedings of the 2002 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Year:
2002

Citing 12
Cited 10

Effective distributed scheduling of parallel workloads

Proceedings of the 1996 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Stochastic analysis of gang scheduling in parallel and distributed systems

Performance Evaluation
Scheduling with implicit information in distributed systems

SIGMETRICS '98/PERFORMANCE '98 Proceedings of the 1998 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Fitting mixtures of exponentials to long-tail distributions to analyze network performance models

Performance Evaluation
A closer look at coscheduling approaches for a network of workstations

Proceedings of the eleventh annual ACM symposium on Parallel algorithms and architectures
A simulation-based study of scheduling mechanisms for a dynamic cluster environment

Proceedings of the 14th international conference on Supercomputing
Myrinet: A Gigabit-per-Second Local Area Network

IEEE Micro
Job Characteristics of a Production Parallel Scientific Workload on the NASA Ames iPSC/860

IPPS '95 Proceedings of the Workshop on Job Scheduling Strategies for Parallel Processing
Workload Evolution on the Cornell Theory Center IBM SP2

IPPS '96 Proceedings of the Workshop on Job Scheduling Strategies for Parallel Processing
Dynamic Partitioning in Different Distributed-Memory Environments

IPPS '96 Proceedings of the Workshop on Job Scheduling Strategies for Parallel Processing
A Comparative Evaluation of Implicit Coscheduling Strategies for Networks of Workstations

HPDC '00 Proceedings of the 9th IEEE International Symposium on High Performance Distributed Computing
Demand-based coscheduling of parallel jobs on multiprogrammed multiprocessors

Task scheduling performance in distributed systems with time varying workload

Neural, Parallel & Scientific Computations
Coscheduling in Clusters: Is It a Viable Alternative?

Proceedings of the 2004 ACM/IEEE conference on Supercomputing
Performance Comparison of Coscheduling Algorithms for Non-Dedicated Clusters Through a Generic Framework

International Journal of High Performance Computing Applications
A comprehensive performance and energy consumption analysis of scheduling alternatives in clusters

The Journal of Supercomputing
Stochastic analysis of multiserver systems

ACM SIGMETRICS Performance Evaluation Review
Xen and co.: communication-aware CPU scheduling for consolidated xen-based hosting platforms

Proceedings of the 3rd international conference on Virtual execution environments
Coscheduled distributed-Web servers on system area network

Journal of Parallel and Distributed Computing
Performance implications of virtualizing multicore cluster machines

Proceedings of the 2nd workshop on System-level virtualization for high performance computing
Performance implications of failures in large-scale cluster scheduling

JSSPP'04 Proceedings of the 10th international conference on Job Scheduling Strategies for Parallel Processing
Pitfalls in parallel job scheduling evaluation

JSSPP'05 Proceedings of the 11th international conference on Job Scheduling Strategies for Parallel Processing

Abstract

Scheduling in large-scale parallel systems has been and continues to be an important and challenging research problem. Several key factors, including the increasing use of off-the-shelf clusters of workstations to build such parallel systems, have resulted in the emergence of a new class of scheduling strategies, broadly referred to as dynamic coscheduling. Unfortunately, the design and performance spaces of these emerging scheduling strategies are both quite large, due in part to the numerous dynamic interactions among the different components of the parallel computing environment, as well as the wide range of applications and systems that can comprise the parallel environment. This in turn makes it difficult to fully explore the benefits and limitations of the various proposed dynamic coscheduling approaches for large-scale systems solely through simulation and/or experimentation.

To gain a better understanding of the fundamental properties of different dynamic coscheduling methods, we formulate a general mathematical model of this class of scheduling strategies within a unified framework that allows us to investigate a wide range of parallel environments. We derive a matrix-analytic analysis based on a stochastic decomposition and a fixed-point iteration. A large number of numerical experiments are performed, in part to examine the accuracy of our approach. These numerical results are in excellent agreement with detailed simulation results. Our mathematical model and analysis are then used to explore several fundamental design and performance tradeoffs associated with the class of dynamic coscheduling policies across a broad spectrum of parallel computing environments.