Impact of Memory Contention on Dynamic Scheduling on NUMA Multiprocessors

Authors:
Dannie Durand;Thierry Montaut;Lionel Kervella;William Jalby
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
1996

Citing 13

Allocating Independent Subtasks on Parallel Processors
IEEE Transactions on Software Engineering

Guided self-scheduling: A practical scheduling scheme for parallel supercomputers
IEEE Transactions on Computers

Determining average program execution times and their variance
PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation

Experimentally Characterizing the Behavior of Multiprocessor Memory Systems: A Case Study
IEEE Transactions on Software Engineering

Hector: A Hierarchically Structured Shared-Memory Multiprocessor
Computer - Special issue on experimental research in computer architecture

Factoring: a practical and robust method for scheduling parallel loops
Proceedings of the 1991 ACM/IEEE conference on Supercomputing

Low-overhead scheduling of nested parallelism
IBM Journal of Research and Development

Using processor affinity in loop scheduling on shared-memory multiprocessors
Proceedings of the 1992 ACM/IEEE conference on Supercomputing

A static parameter based performance prediction tool for parallel programs
ICS '93 Proceedings of the 7th international conference on Supercomputing

Performance evaluation and prediction for parallel algorithms on the BBN GP1000
ICS '90 Proceedings of the 4th international conference on Supercomputing

The directory-based cache coherence protocol for the DASH multiprocessor
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture

Performance of Synchronous Parallel Algorithms with Regular Structures
IEEE Transactions on Parallel and Distributed Systems

Trapezoid Self-Scheduling: A Practical Scheduling Scheme for Parallel Compilers
IEEE Transactions on Parallel and Distributed Systems

Cited 4

Dynamic scheduling with incomplete information
Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures

Performance Metrics for Embedded Parallel Pipelines
IEEE Transactions on Parallel and Distributed Systems

Building portable thread schedulers for hierarchical multiprocessors: the bubblesched framework
Euro-Par'07 Proceedings of the 13th international Euro-Par conference on Parallel Processing

Improving execution unit occupancy on SMT-based processors through hardware-aware thread scheduling
Future Generation Computer Systems

Abstract

Self-scheduling is a method for task scheduling in parallel programs, in which each processor acquires a new block of tasks for execution whenever it becomes idle. To get the best performance, the block size must be chosen to balance the scheduling overhead against the load imbalance. To determine the best block size, a better understanding of the role of load imbalance in self-scheduling performance is needed.

In this paper we study the effect of memory contention on task duration distributions and, hence, load balancing in self-scheduling on a Nonuniform Memory Access (NUMA) machine. Experimental studies on a BBN TC2000 are used to reveal the strengths and weaknesses of analytical performance models to predict running time and optimal block size. The models are shown to be very accurate for small block sizes. However, the models fail when the block size is large due to a previously unrecognized source of load imbalance. We extend the analytical models to address this failure. The implications for the construction of compilers and runtime systems are discussed.
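
The trade-off between scheduling overhead and load imbalance described in the abstract can be illustrated with a small simulation. The sketch below is not the analytical model from the paper and does not model memory contention. It assumes a fixed block size, a constant per-acquisition cost (`overhead`), and task durations supplied by the caller. The function name `self_schedule` and all parameter names are hypothetical, chosen only for this illustration.

```python
import heapq
import random


def self_schedule(durations, num_procs, block_size, overhead):
    """Simulate fixed-size self-scheduling and return the finishing time.

    Whenever a processor becomes idle it takes the next `block_size` tasks
    from a shared queue. Each acquisition costs `overhead` time units, an
    assumed stand-in for the synchronized access to the shared task index.
    """
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    # Min-heap of (time at which the processor becomes idle, processor id).
    idle_at = [(0.0, p) for p in range(num_procs)]
    heapq.heapify(idle_at)
    next_task = 0
    finish = 0.0
    while next_task < len(durations):
        # The processor that becomes idle first acquires the next block.
        now, proc = heapq.heappop(idle_at)
        block = durations[next_task:next_task + block_size]
        next_task += len(block)
        done = now + overhead + sum(block)
        finish = max(finish, done)
        heapq.heappush(idle_at, (done, proc))
    return finish


if __name__ == "__main__":
    # Illustrative workload only: exponentially distributed task durations.
    rng = random.Random(0)
    tasks = [rng.expovariate(1.0) for _ in range(4096)]
    for k in (1, 4, 16, 64, 256, 1024):
        print(k, round(self_schedule(tasks, num_procs=8, block_size=k, overhead=0.5), 1))
```

With unit-length tasks the two extremes are easy to check by hand: a block size of 1 pays the overhead on every task, while a block size equal to the whole task list leaves all but one processor idle. Intermediate block sizes sit between these two costs, which is the balance the abstract refers to.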