Abstract: Parallelization of sequential programs focuses primarily on loop structures, since array references indexed by the loop variables usually exhibit data dependencies among them. For loops whose data dependency distance is constant, several compile-time parallelization methods have been introduced. When the dependency distance varies, however, extracting parallelism at compile time is more complicated. We propose a generalized parallelism extraction scheme for nested loops. The method automatically converts a sequential loop into a nested parallel DOALL loop at compile time, and it applies whether the dependency distance is constant or varying. Our test results show that the proposed scheme is superior to conventional methods when a sufficiently large number of processors is available.
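To make the distinction between constant and varying dependency distance concrete, the following is a minimal illustrative sketch, not the scheme proposed in the paper. It enumerates, by brute force, the flow-dependence distances of two hypothetical single loops: `A[i] = A[i-3] + 1` (constant distance) and `A[2*i] = A[i] + 1` (varying distance). The function name `flow_dependence_distances` and both example loops are assumptions chosen for illustration only.

```python
def flow_dependence_distances(write_index, read_index, n):
    """Return the sorted distances j - i such that iteration i writes an
    array element that a later iteration j reads (a flow dependence).

    write_index and read_index map an iteration number to the array
    subscript written or read in that iteration (hypothetical helpers).
    """
    # Record which iterations write each array element.
    writers = {}
    for i in range(n):
        writers.setdefault(write_index(i), []).append(i)

    # For every read, look up earlier iterations that wrote that element.
    distances = set()
    for j in range(n):
        for i in writers.get(read_index(j), []):
            if i < j:
                distances.add(j - i)
    return sorted(distances)


# Loop 1: A[i] = A[i-3] + 1, every dependence has the same distance.
constant = flow_dependence_distances(lambda i: i, lambda i: i - 3, 10)

# Loop 2: A[2*i] = A[i] + 1, the distance grows with the iteration number.
varying = flow_dependence_distances(lambda i: 2 * i, lambda i: i, 10)

print(constant)  # [3]
print(varying)   # [1, 2, 3, 4]
```

In the first loop, iterations more than a fixed distance apart are the only ones that depend on each other, which is the case constant-distance methods exploit. In the second loop no single distance describes the dependencies, which is the case the abstract describes as more complicated to handle at compile time.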