Guided self-scheduling: A practical scheduling scheme for parallel supercomputers
IEEE Transactions on Computers
Compiler Optimizations for Enhancing Parallelism and Their Impact on Architecture Design
IEEE Transactions on Computers - Special issue on architectural support for programming languages and operating systems
Loop quantization: a generalized loop unwinding technique
Journal of Parallel and Distributed Computing - Special Issue on Languages, Compilers and Environments for Parallel Programming
POPL '88 Proceedings of the 15th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages
Minimum Distance: A Method for Partitioning Recurrences for Multiprocessors
IEEE Transactions on Computers
Proceedings of the 1989 ACM/IEEE conference on Supercomputing
Supercompilers for parallel and vector computers
Run-Time Parallelization and Scheduling of Loops
IEEE Transactions on Computers
A data locality optimizing algorithm
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Time Optimal Linear Schedules for Algorithms with Uniform Dependencies
IEEE Transactions on Computers
Independent Partitioning of Algorithms with Uniform Dependencies
IEEE Transactions on Computers
The Organization of Computations for Uniform Recurrence Equations
Journal of the ACM (JACM)
The parallel execution of DO loops
Communications of the ACM
Optimizing Supercompilers for Supercomputers
Parallel Programming and Compilers
Structure of Computers and Computations
A Loop Transformation Theory and an Algorithm to Maximize Parallelism
IEEE Transactions on Parallel and Distributed Systems
On Time Mapping of Uniform Dependence Algorithms into Lower Dimensional Processor Arrays
IEEE Transactions on Parallel and Distributed Systems
Partitioning and Labeling of Loops by Unimodular Transformations
IEEE Transactions on Parallel and Distributed Systems
In this paper, an approach to exploiting parallelism within nested loops is proposed. The method first identifies all initially independent computations and then, based on them, derives valid partitioning bases for partitioning the entire iteration space of the loop nest. Because the shape of the iteration space is taken into account, pseudo-dependence relations are eliminated and hence more parallelism is exposed. Our approach provides a systematic method for maximizing the degree of fine- or coarse-grain parallelism, and it sidesteps the open question of how to combine different loop transformations to maximize parallelism. It is also shown that our approach can exploit more parallelism than related work and has several advantages over existing methods.
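The idea of splitting an iteration space into independent computations can be illustrated with a minimal sketch. This is not the paper's algorithm; it assumes the simplest possible case, a one-dimensional loop with a single constant dependence distance `d`, where iteration `i` depends on iteration `i - d`. Under that assumption, iterations in different residue classes modulo `d` never depend on one another, so each class forms an independent chain that could run on its own processor:

```python
def partition_iterations(n, d):
    """Partition iterations 0..n-1 of a loop with a single uniform
    dependence distance d into d independent chains.

    Iteration i depends only on iteration i - d, so all iterations
    sharing the same residue i % d form one sequential chain, and
    distinct chains carry no dependences between them.
    """
    chains = [[] for _ in range(d)]
    for i in range(n):
        chains[i % d].append(i)
    return chains

# Example: 10 iterations, dependence distance 3 -> 3 independent chains.
chains = partition_iterations(10, 3)
# chains == [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
```

Each inner list must still execute in order (it carries the dependence), but the lists themselves are mutually independent, which is the coarse-grain parallelism such a partitioning exposes. Handling multi-dimensional iteration spaces, multiple distance vectors, and the space's boundary shape, as the paper addresses, requires substantially more machinery than this sketch.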