Adaptive Parallel Job Scheduling with Flexible Coscheduling

Authors:
Eitan Frachtenberg;Dror G. Feitelson;Fabrizio Petrini;Juan Fernandez
Affiliations:
IEEE;IEEE;IEEE;IEEE
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
2005



Abstract

Many scientific and high-performance computing applications consist of multiple processes running on different processors that communicate frequently. Because of their synchronization needs, these applications can suffer severe performance penalties if their processes are not all coscheduled to run together. Two common approaches to coscheduling jobs are batch scheduling, wherein nodes are dedicated for the duration of the run, and gang scheduling, wherein time slicing is coordinated across processors. Both work well when jobs are load-balanced and make use of the entire parallel machine. However, these conditions are rarely met and most realistic workloads consequently suffer from both internal and external fragmentation, in which resources and processors are left idle because jobs cannot be packed with perfect efficiency. This situation leads to reduced utilization and suboptimal performance. Flexible CoScheduling (FCS) addresses this problem by monitoring each job's computation granularity and communication pattern and scheduling jobs based on their synchronization and load-balancing requirements. In particular, jobs that do not require stringent synchronization are identified, and are not coscheduled; instead, these processes are used to reduce fragmentation. FCS has been fully implemented on top of the STORM resource manager on a 256-processor Alpha cluster and compared to batch, gang, and implicit coscheduling algorithms. This paper describes in detail the implementation of FCS and its performance evaluation with a variety of workloads, including large-scale benchmarks, scientific applications, and dynamic workloads. The experimental results show that FCS saturates at higher loads than other algorithms (up to 54 percent higher in some cases), and displays lower response times and slowdown than the other algorithms in nearly all scenarios.
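
The classification step described in the abstract can be illustrated with a minimal sketch. This is not the FCS implementation: the names `ProcessStats`, `needs_coscheduling`, and `partition`, and the threshold `FINE_GRAIN_US`, are hypothetical and chosen only for illustration. The sketch assumes that computation granularity alone decides whether a process needs stringent synchronization, which simplifies the monitoring of granularity and communication pattern that the abstract describes.

```python
from dataclasses import dataclass

# Hypothetical threshold, for illustration only; not a value from the paper.
FINE_GRAIN_US = 1000.0  # compute intervals shorter than this count as fine-grained


@dataclass
class ProcessStats:
    """Per-process measurement a monitor might collect (assumed field)."""
    granularity_us: float  # average computation time between communication events


def needs_coscheduling(stats: ProcessStats) -> bool:
    """Assume fine-grained processes synchronize often and must run together."""
    return stats.granularity_us < FINE_GRAIN_US


def partition(procs: dict[str, ProcessStats]) -> tuple[list[str], list[str]]:
    """Split processes into those to coschedule and those usable as fillers."""
    coscheduled: list[str] = []
    fillers: list[str] = []
    for name, stats in procs.items():
        if needs_coscheduling(stats):
            coscheduled.append(name)
        else:
            fillers.append(name)  # may run in otherwise idle slots
    return coscheduled, fillers


if __name__ == "__main__":
    procs = {
        "fine": ProcessStats(granularity_us=100.0),
        "coarse": ProcessStats(granularity_us=50_000.0),
    }
    print(partition(procs))
```

In this sketch the coarse-grained process lands in the filler list, mirroring the abstract's statement that processes without stringent synchronization needs are not coscheduled and are instead used to reduce fragmentation.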