Scheduling User-Level Threads on Distributed Shared-Memory Multiprocessors

Authors:
Eleftherios D. Polychronopoulos; Theodore S. Papatheodorou
Venue:
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Year:
1999

Abstract

In this paper we present Dynamic Bisectioning (DBS), a simple but powerful comprehensive scheduling policy for user-level threads that unifies the exploitation of (multidimensional) loop parallelism and nested functional (or task) parallelism. Unlike other schemes proposed and used thus far, DBS is not constrained to scheduling DAGs or singly nested parallel loops. Rather, our policy encompasses the most general type of parallel program model, one that allows an arbitrary mix of nested loops and nested DAGs (directed acyclic task graphs). DBS employs a two-level dynamic policy that is adaptive and sensitive to the type and amount of parallelism at hand. At one extreme, DBS approximates static scheduling, thereby facilitating data locality; at the other extreme, it resorts to dynamic thread migration in order to balance uneven loads. Even this migration is done in a controlled way so as to minimize network latency.
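
The abstract does not spell out the DBS algorithm, so the following is only a minimal illustrative sketch of the two extremes it describes, not the authors' method. It assumes (hypothetically) that each processor starts with a contiguous block of tasks in a local queue, which mimics static scheduling, and that an idle processor takes half of the longest remaining queue, which stands in for controlled thread migration. The names static_partition, bisect_from_busiest, and simulate, as well as the round-based cost model, are inventions for this example.

```python
from collections import deque


def static_partition(num_tasks, num_procs):
    """Give each processor a contiguous block of task ids (static-like start)."""
    base, extra = divmod(num_tasks, num_procs)
    queues, start = [], 0
    for p in range(num_procs):
        size = base + (1 if p < extra else 0)
        queues.append(deque(range(start, start + size)))
        start += size
    return queues


def bisect_from_busiest(queues, idle):
    """Assumed migration rule: move the back half of the longest queue
    to the idle processor. Returns the number of tasks migrated."""
    victim = max(range(len(queues)), key=lambda p: len(queues[p]))
    count = len(queues[victim]) // 2
    if victim == idle or count == 0:
        return 0
    moved = [queues[victim].pop() for _ in range(count)]
    queues[idle].extend(reversed(moved))  # keep the original task order
    return count


def simulate(costs, num_procs):
    """Round-based simulation; costs[i] is the number of time steps task i needs.
    Returns (total_steps, migrated_tasks)."""
    queues = static_partition(len(costs), num_procs)
    remaining = [0] * num_procs  # steps left on each processor's current task
    steps = migrations = 0
    while any(remaining) or any(queues):
        for p in range(num_procs):
            if remaining[p] == 0 and not queues[p]:
                # Local work exhausted: fall back to migration.
                migrations += bisect_from_busiest(queues, p)
            if remaining[p] == 0 and queues[p]:
                remaining[p] = costs[queues[p].popleft()]
            if remaining[p] > 0:
                remaining[p] -= 1
        steps += 1
    return steps, migrations
```

Under these assumptions, a balanced load such as simulate([1] * 8, 4) finishes in 2 steps with no migration, so the schedule stays purely static and local. An uneven load such as simulate([4, 4, 4, 4, 1, 1, 1, 1], 2) finishes in 12 steps with one migrated task, compared with 16 steps if the first processor had to run all four long tasks itself.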