Dynamic Processor Self-Scheduling for General Parallel Nested Loops

Authors:
Zhixi Fang; Peiyi Tang; Pen-Chung Yew; Chuan-Qi Zhu
Venue:
IEEE Transactions on Computers
Year:
1990

Abstract

A processor self-scheduling scheme is proposed for general parallel nested loops in multiprocessor systems. In this scheme, programs are instrumented to allow processors to schedule loop iterations among themselves dynamically at run time without involving the operating system. The scheme has two levels. At the low level, it uses simple fetch-and-op operations to take advantage of the regular structure in the innermost parallel loop nests; at the high level, the irregular structure of the outer loops (parallel or serial) and the IF-THEN-ELSE constructs are handled by using dynamic parallel linked lists. The larger granularity of the processes at the high level easily justifies the added overhead incurred from maintaining such dynamic data structures. The use of guided self-scheduling (GSS) and shortest-delay self-scheduling (SDSS) in this scheme is analyzed.
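
The low-level idea in the abstract, processors claiming iterations of an innermost parallel loop from a shared counter without operating-system involvement, can be illustrated with a minimal sketch. It is not the paper's implementation: the names `LoopScheduler`, `claim`, `gss_chunk`, and `run_parallel_loop` are hypothetical, a lock stands in for the hardware fetch-and-op primitive, and the chunk rule is the usual guided self-scheduling formula of ceil(remaining / P) for P processors. The high-level linked-list mechanism and SDSS are not shown, since this record does not describe them in enough detail.

```python
import threading


def gss_chunk(remaining, num_procs):
    """Guided self-scheduling chunk size: ceil(remaining / num_procs)."""
    return (remaining + num_procs - 1) // num_procs


class LoopScheduler:
    """Shared loop index for one parallel loop (hypothetical helper).

    Assumption: the lock emulates an atomic fetch-and-op on the index;
    real hardware would do this without a software lock.
    """

    def __init__(self, n_iters, num_procs):
        self.n_iters = n_iters
        self.num_procs = num_procs
        self.next_index = 0
        self._lock = threading.Lock()

    def claim(self):
        """Atomically claim the next chunk; return (start, stop) or None."""
        with self._lock:
            remaining = self.n_iters - self.next_index
            if remaining <= 0:
                return None
            start = self.next_index
            self.next_index += gss_chunk(remaining, self.num_procs)
            return start, self.next_index


def run_parallel_loop(n_iters, num_procs, body):
    """Run body(i) for i in range(n_iters) with self-scheduling workers."""
    scheduler = LoopScheduler(n_iters, num_procs)

    def worker():
        # Each worker schedules itself: it keeps claiming chunks until
        # the shared index shows that no iterations are left.
        while True:
            chunk = scheduler.claim()
            if chunk is None:
                return
            for i in range(*chunk):
                body(i)

    threads = [threading.Thread(target=worker) for _ in range(num_procs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Calling `run_parallel_loop(100, 4, body)` runs `body` once for every index from 0 to 99. Under the GSS rule, early claims take large chunks and later claims shrink toward single iterations, which trades fewer accesses to the shared counter against load balance near the end of the loop.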