A dynamic scheduling method for irregular parallel programs

Authors:
Steven Lucco
Affiliations:
-
Venue:
PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
Year:
1992

Abstract

This paper develops a methodology for compiling and executing irregular parallel programs. Such programs implement parallel operations whose size and work distribution depend on input data. We show a fundamental relationship between three quantities that characterize an irregular parallel computation: the total available parallelism, the optimal grain size, and the statistical variance of execution times for individual tasks. This relationship yields a dynamic scheduling algorithm that substantially reduces the overhead of executing irregular parallel operations.

We incorporated this algorithm into an extended Fortran compiler. The compiler accepts as input a subset of Fortran D which includes blocked and cyclic decompositions and perfect alignment; it outputs Fortran 77 augmented with calls to library routines written in C. For irregular parallel operations, the compiled code gathers information about available parallelism and task execution time variance and uses this information to schedule the operation. On distributed memory architectures, the compiler encodes information about data access patterns for the runtime scheduling system so that it can preserve communication locality.

We evaluated these compilation techniques using a set of application programs, including climate modeling, circuit simulation, and x-ray tomography, that contain irregular parallel operations. The results demonstrate that, for these applications, the dynamic techniques described here achieve near-optimal efficiency on large numbers of processors. In addition, they perform significantly better, on these problems, than any previously proposed static or dynamic scheduling algorithm.
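
The abstract ties grain size to available parallelism and to the variance of task execution times, but it does not state the scheduling rule itself. The sketch below only illustrates the general idea and is not the paper's algorithm: the function name `chunk_sizes`, its parameters, and the damping rule `remaining / (processors * (1 + cv))` are all assumptions made for this example. Here `cv` is the coefficient of variation (standard deviation divided by mean) of measured task times. With zero variance the rule reduces to guided self-scheduling, which hands out `ceil(remaining / processors)` tasks at a time; higher variance produces smaller chunks, so that late stragglers cost less.

```python
import math


def chunk_sizes(total_tasks, processors, mean_time, std_time, min_chunk=1):
    """Return the sequence of chunk sizes a central scheduler would hand out.

    Hypothetical rule (an assumption for illustration, not taken from the paper):
        size = ceil(remaining / (processors * (1 + cv)))
    where cv = std_time / mean_time is the coefficient of variation.
    """
    if total_tasks < 0 or processors < 1 or min_chunk < 1:
        raise ValueError("need total_tasks >= 0, processors >= 1, min_chunk >= 1")
    if mean_time <= 0 or std_time < 0:
        raise ValueError("need mean_time > 0 and std_time >= 0")

    cv = std_time / mean_time
    remaining = total_tasks
    sizes = []
    while remaining > 0:
        # Larger variance -> larger divisor -> smaller chunk.
        size = math.ceil(remaining / (processors * (1.0 + cv)))
        # Never hand out fewer than min_chunk tasks or more than are left.
        size = max(min_chunk, min(size, remaining))
        sizes.append(size)
        remaining -= size
    return sizes


if __name__ == "__main__":
    # Uniform task times: behaves like guided self-scheduling.
    print(chunk_sizes(100, 4, mean_time=1.0, std_time=0.0))
    # Highly variable task times: chunks shrink.
    print(chunk_sizes(100, 4, mean_time=1.0, std_time=1.0))
```

In a real runtime, `mean_time` and `std_time` would be estimated while the operation executes, as the abstract describes for the compiled code; here they are passed in directly to keep the sketch self-contained.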