An Integrated Approach to Parallel Scheduling Using Gang-Scheduling, Backfilling, and Migration

Authors:
Yanyong Zhang; Hubertus Franke; Jose Moreira; Anand Sivasubramaniam
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
2003

Abstract

Effective scheduling strategies to improve response times, throughput, and utilization are an important consideration in large supercomputing environments. Parallel machines in these environments have traditionally used space-sharing strategies, which accommodate multiple jobs at the same time by partitioning the machine and dedicating each partition's nodes to a single job until it completes. This approach, however, can result in low system utilization and large job wait times. This paper discusses three techniques that can be used beyond simple space-sharing to improve the performance of large parallel systems. The first technique we analyze is backfilling, the second is gang-scheduling, and the third is migration. The main contribution of this paper is an analysis of the effects of combining the above techniques. Using extensive simulations based on detailed models of realistic workloads, the benefits of combining the various techniques are shown over a spectrum of performance criteria.