Partitioning and Mapping Algorithms into Fixed Size Systolic Arrays
IEEE Transactions on Computers
Digital image processing (2nd ed.)
A design methodology for synthesizing parallel algorithms and architectures. Journal of Parallel and Distributed Computing
Nearest-neighbor mapping of finite element graphs onto processor meshes. IEEE Transactions on Computers
Synthesizing Linear Array Algorithms from Nested FOR Loop Algorithms. IEEE Transactions on Computers
Minimum Distance: A Method for Partitioning Recurrences for Multiprocessors. IEEE Transactions on Computers
The parallel execution of DO loops. Communications of the ACM
The Generation of a Class of Multipliers: Synthesizing Highly Parallel Algorithms in VLSI. IEEE Transactions on Computers
Pipelined Data Parallel Algorithms-II: Design. IEEE Transactions on Parallel and Distributed Systems
Multiprocessors: discussion of some theoretical and practical problems
Compiler technology for parallel scientific computation. Scientific Programming
A Unifying Lattice-Based Approach for the Partitioning of Systolic Arrays via LPGS and LSGP. Journal of VLSI Signal Processing Systems
Chain Grouping: A Method for Partitioning Loops onto Mesh-Connected Processor Arrays. IEEE Transactions on Parallel and Distributed Systems
Communication-free partitioning of nested loops. Compiler optimizations for scalable parallel systems
Communication-Free Data Allocation Techniques for Parallelizing Compilers on Multicomputers. IEEE Transactions on Parallel and Distributed Systems
Evaluation of Loop Grouping Methods Based on Orthogonal Projection Spaces. Proceedings of the 2000 International Conference on Parallel Processing (ICPP '00)
Automatic parallel code generation for tiled nested loops. Proceedings of the 2004 ACM Symposium on Applied Computing
Hyperplane Grouping and Pipelined Schedules: How to Execute Tiled Loops Fast on Clusters of SMPs. The Journal of Supercomputing
A method is presented for executing nested loops with constant loop-carried dependencies in parallel on message-passing multiprocessor systems while reducing communication overhead. In the partitioning phase, the nested loop is divided into blocks that reduce interblock communication, without regard to the machine topology. The execution ordering of the iterations is defined by a time function based on L. Lamport's (1974) hyperplane method. The iterations are then partitioned into blocks so that this execution ordering is preserved and the amount of interblock communication is minimized. In the mapping phase, the partitioned blocks are assigned to a fixed-size multiprocessor system so that blocks that must exchange data frequently are allocated to the same processor or to neighboring processors. A heuristic mapping algorithm for hypercube machines is proposed.
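The two phases described above can be illustrated with a small sketch. This is not the paper's exact algorithm: the hyperplane vector (1, 1), the blocking along one loop axis, and the Gray-code assignment of block indices to hypercube nodes are all simplifying assumptions chosen to make the idea concrete.

```python
# Illustrative sketch (assumptions, not the paper's method): hyperplane
# scheduling of a 2D loop nest, partitioning into blocks, and mapping
# blocks to hypercube nodes so adjacent blocks land on neighboring nodes.

def hyperplane_time(i, j, pi=(1, 1)):
    # Lamport's hyperplane method orders iterations by t = pi . (i, j);
    # all iterations on the same hyperplane can execute in parallel.
    return pi[0] * i + pi[1] * j

def partition_into_blocks(n, block_size):
    # Group iterations into contiguous strips along the i axis. For
    # assumed dependence vectors such as (1, 0) and (0, 1), only the
    # (1, 0) dependencies cross block boundaries, which keeps interblock
    # communication low without disturbing the hyperplane ordering.
    blocks = {}
    for i in range(n):
        for j in range(n):
            blocks.setdefault(i // block_size, []).append((i, j))
    return blocks

def gray(b):
    # Binary-reflected Gray code: consecutive block indices differ in
    # one bit, i.e. they map to neighboring hypercube processors.
    return b ^ (b >> 1)

def map_blocks_to_hypercube(blocks, dim):
    # Heuristic mapping: blocks that exchange data frequently (here,
    # adjacent strips) are placed on the same or neighboring nodes.
    nprocs = 1 << dim
    return {b: gray(b % nprocs) for b in blocks}
```

For a 4x4 loop nest with `block_size=2` and a 2-dimensional hypercube, `partition_into_blocks(4, 2)` yields two strips of eight iterations each, and `map_blocks_to_hypercube` places strips 0 and 1 on nodes 0 and 1, which differ in a single bit and are therefore directly connected.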