Advanced compiler optimizations for supercomputers
Communications of the ACM - Special issue on parallelism
Computer
Data dependence and its application to parallel processing
International Journal of Parallel Programming
POPL '88 Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Process decomposition through locality of reference
PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
Proceedings of the 1989 ACM/IEEE conference on Supercomputing
Supercompilers for parallel and vector computers
Scanning polyhedra with DO loops
PPOPP '91 Proceedings of the third ACM SIGPLAN symposium on Principles and practice of parallel programming
A data locality optimizing algorithm
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
The DINO parallel programming language
Journal of Parallel and Distributed Computing
Compiling programs for nonshared memory machines
Compiling Fortran D for MIMD distributed-memory machines
Communications of the ACM
Global optimizations for parallelism and locality on scalable parallel machines
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Communication-free hyperplane partitioning of nested loops
Journal of Parallel and Distributed Computing
Some efficient solutions to the affine scheduling problem: I. One-dimensional time
International Journal of Parallel Programming
Evaluating compiler optimizations for Fortran D
Journal of Parallel and Distributed Computing - Special issue on data parallel algorithms and programming
An optimizing Fortran D compiler for MIMD distributed-memory machines
Parallel Computing - Special double issue: SUPRENUM and GENESIS
Affine-by-statement scheduling of uniform and affine loop nests over parametric domains
Journal of Parallel and Distributed Computing
A Unified Framework for Optimizing Communication in Data-Parallel Programs
IEEE Transactions on Parallel and Distributed Systems
The parallel execution of DO loops
Communications of the ACM
Optimizing Supercompilers for Supercomputers
High Performance Compilers for Parallel Computing
Partitioning and Mapping Nested Loops on Multiprocessor Systems
IEEE Transactions on Parallel and Distributed Systems
Compiling Global Name-Space Parallel Loops for Distributed Execution
IEEE Transactions on Parallel and Distributed Systems
A Loop Transformation Theory and an Algorithm to Maximize Parallelism
IEEE Transactions on Parallel and Distributed Systems
Compile-Time Techniques for Data Distribution in Distributed Memory Machines
IEEE Transactions on Parallel and Distributed Systems
Communication-Free Data Allocation Techniques for Parallelizing Compilers on Multicomputers
IEEE Transactions on Parallel and Distributed Systems
Mapping affine loop nests: new results
HPCN Europe '95 Proceedings of the International Conference and Exhibition on High-Performance Computing and Networking
How to Optimize Residual Communications?
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
Statement-Level Communication-Free Partitioning Techniques for Parallelizing Compilers
LCPC '96 Proceedings of the 9th International Workshop on Languages and Compilers for Parallel Computing
Communication-Free Parallelization via Affine Transformations
LCPC '94 Proceedings of the 7th International Workshop on Languages and Compilers for Parallel Computing
Compile time techniques for parallel execution of loops on distributed memory multiprocessors
This chapter introduces several communication-free partitioning techniques for nested loops reported in the literature. Because the cost of data communication in distributed-memory multicomputers is much higher than that of a primitive computation, reducing data communication is important; the ideal is to eliminate it entirely whenever possible. Over the last few years, many techniques have been proposed that investigate how to achieve communication-free execution by partitioning nested loops. This chapter surveys these techniques and points out the differences among them.
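To make the idea concrete, here is a minimal sketch (not drawn from any one of the surveyed papers; loop body, sizes, and the `owner` function are illustrative assumptions). The loop nest below carries a dependence only along the `j` dimension, i.e., within a single row of the array, so a row-wise partition assigns every iteration and every datum it touches to one processor, and the parallel execution needs no communication:

```python
# Hypothetical example: a 2-D loop nest whose only dependence is
# A[i][j] <- A[i][j-1], staying inside each row. Partitioning the
# i-loop across processors is therefore communication-free.

N, P = 8, 4  # assumed problem size and processor count

A = [[0] * N for _ in range(N)]

def owner(i):
    """Block partition: row i belongs to processor i*P//N."""
    return i * P // N

# Each (simulated) processor executes only the rows it owns.
# No dependence crosses a row boundary, so no data is exchanged.
for p in range(P):
    for i in range(N):
        if owner(i) != p:
            continue
        for j in range(1, N):
            A[i][j] = A[i][j - 1] + 1

print(A[0])  # each row equals the sequential result 0, 1, ..., N-1
```

The surveyed techniques generalize this pattern: they search for hyperplane or affine partitions of the iteration and data spaces such that every iteration is co-located with all the data it references.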