Memory and Network Bandwidth Aware Scheduling of Multiprogrammed Workloads on Clusters of SMPs

Authors:
Evangelos Koukis;Nectarios Koziris
Affiliations:
National Technical University of Athens, Greece;National Technical University of Athens, Greece
Venue:
ICPADS '06 Proceedings of the 12th International Conference on Parallel and Distributed Systems - Volume 1
Year:
2006


Abstract

Symmetric Multiprocessors (SMPs), combined with modern interconnection technologies, are commonly used to build cost-effective compute clusters. However, contention among processors for access to shared resources, such as the main memory bus and the NIC, can limit their efficiency significantly. In this paper, we first provide an experimental demonstration of the effect of resource contention on the total execution time of applications. Then, we present the design and implementation of an informed gang-like scheduling algorithm aimed at improving the throughput of multiprogrammed workloads on clusters of SMPs. Our algorithm selects the processes to be coscheduled so as to neither saturate nor underutilize the memory bus or network link bandwidth. Its input data are acquired dynamically using hardware monitoring counters and a modified Myrinet NIC firmware, without any modifications to existing application binaries. Experimental evaluation shows that throughput can improve by up to 40-48% compared to the standard Linux 2.6 O(1) scheduler.
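
The abstract does not spell out the selection rule, so the following is only an illustrative sketch of what bandwidth-aware coscheduling on one SMP node might look like, not the authors' algorithm. It assumes that each process carries a measured memory bus demand and a measured NIC demand (in the paper these would come from hardware counters and the modified firmware; here they are plain numbers), and it uses a simple greedy "best fit" heuristic. The names `Proc` and `pick_coschedule`, the units, and the misfit metric are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proc:
    name: str
    mem_bw: float  # assumed measured memory bus demand (e.g. MB/s)
    net_bw: float  # assumed measured NIC demand (e.g. MB/s)


def pick_coschedule(queue, n_cpus, mem_cap, net_cap):
    """Greedily pick up to n_cpus processes for the next time quantum.

    Hypothetical heuristic: the head of the queue always runs (so no
    process starves); each remaining CPU gets the candidate whose demand
    is closest to an equal share of the bandwidth that is still free.
    """
    if not queue or n_cpus <= 0:
        return []
    chosen = [queue[0]]
    candidates = list(queue[1:])
    while len(chosen) < n_cpus and candidates:
        slots_left = n_cpus - len(chosen)
        mem_left = mem_cap - sum(p.mem_bw for p in chosen)
        net_left = net_cap - sum(p.net_bw for p in chosen)
        # Ideal per-CPU demand: uses up what is left without exceeding it.
        mem_target = max(mem_left, 0.0) / slots_left
        net_target = max(net_left, 0.0) / slots_left

        def misfit(p):
            # Normalized distance from the ideal demand on both resources.
            return (abs(p.mem_bw - mem_target) / mem_cap
                    + abs(p.net_bw - net_target) / net_cap)

        # Prefer candidates that keep both resources below saturation;
        # if none fits, fall back to the least-bad one rather than idle.
        fitting = [p for p in candidates
                   if p.mem_bw <= mem_left and p.net_bw <= net_left]
        best = min(fitting or candidates, key=misfit)
        chosen.append(best)
        candidates.remove(best)
    return chosen
```

With a queue of a memory-intensive process followed by another memory-intensive one, a light one, and a network-intensive one, this sketch would pair the head of the queue with the network-intensive process on a two-CPU node, since that combination leaves neither the bus nor the link saturated or idle. A real gang-like scheduler would also have to coordinate this choice across nodes, which the sketch ignores.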