A simulation-based study of scheduling mechanisms for a dynamic cluster environment

Authors:
Yanyong Zhang;Anand Sivasubramaniam;Jose Moreira;Hubertus Franke
Affiliations:
Department of Computer Science & Engineering, The Pennsylvania State University, University Park, PA;Department of Computer Science & Engineering, The Pennsylvania State University, University Park, PA;IBM T. J. Watson Research Center, P. O. Box 218, Yorktown Heights, NY;IBM T. J. Watson Research Center, P. O. Box 218, Yorktown Heights, NY
Venue:
Proceedings of the 14th international conference on Supercomputing
Year:
2000

Abstract

Scheduling processes onto the processors of a parallel machine has always been an important and challenging area of research. The issue becomes even more crucial and difficult as we gradually move to off-the-shelf workstations, operating systems, and high-bandwidth networks to build cost-effective clusters for demanding applications. Clusters are gaining acceptance not just in scientific applications that need supercomputing power, but also in domains such as databases, web services, and multimedia, which place diverse Quality-of-Service (QoS) demands on the underlying system. Further, these applications have diverse characteristics in terms of their computation, communication, and I/O requirements, which makes conventional parallel scheduling solutions, such as space sharing or coscheduling, unattractive. At the same time, leaving each node's native operating system to make scheduling decisions independently can lead to ineffective use of system resources whenever there is communication. Instead, an emerging class of dynamic coscheduling mechanisms, which attempt to take remedial actions that guide the system towards coscheduled execution without requiring explicit synchronization, offers a lot of promise for cluster scheduling. Using a detailed simulator, this paper evaluates the pros and cons of different dynamic coscheduling alternatives and compares them with traditional coscheduling and with no coordinated scheduling at all. The impact of dynamic job arrivals, job characteristics, and different system parameters on these alternatives is evaluated in terms of several performance criteria.