Linear transformations are widely used to vectorize and parallelize loops. Unimodular transformations form a subset of these. When a unimodular transformation is used, the exact bounds of the transformed loop nest are easily computed and all loop steps are equal to 1. Unimodular loop transformations have been widely used because they permit the implementation of many useful loop transformations. Recently, nonunimodular transformations have been proposed to reduce communication requirements or to use the memory hierarchy efficiently. The methods used for unimodular transformations do not carry over to nonunimodular transformations, since they do not produce the exact bounds of the transformed loop nest. In this paper, we present a method for nested loop transformation that gives the exact bounds for both unimodular and nonunimodular transformations. The basic idea is to use the Hermite Normal Form (HNF) of the transformation matrix.
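To make the role of the HNF concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a square, nonsingular integer transformation matrix `T` and computes a lower-triangular column-style HNF `H` using only unimodular column operations, so that `T` and `H` generate the same integer lattice. The function names `column_hnf` and `on_lattice` and the 2-D example matrix are illustrative assumptions. The diagonal of `H` gives the loop steps of the transformed nest, and the entries below the diagonal give the offset of each inner loop relative to the outer indices. The sketch covers only the steps, not the computation of exact loop bounds.

```python
def column_hnf(T):
    """Lower-triangular Hermite Normal Form of a nonsingular integer matrix.

    Only unimodular column operations are used, so H = T * U for some
    unimodular U, and T and H generate the same integer lattice.
    """
    H = [row[:] for row in T]
    n = len(H)
    for i in range(n):
        # Euclid's algorithm on row i: zero every entry right of the diagonal.
        for j in range(i + 1, n):
            while H[i][j] != 0:
                q = H[i][i] // H[i][j]
                for r in range(n):
                    H[r][i] -= q * H[r][j]
                    H[r][i], H[r][j] = H[r][j], H[r][i]
        # Make the diagonal entry positive.
        if H[i][i] < 0:
            for r in range(n):
                H[r][i] = -H[r][i]
        # Reduce entries left of the diagonal into the range [0, H[i][i]).
        for k in range(i):
            q = H[i][k] // H[i][i]
            for r in range(n):
                H[r][k] -= q * H[r][i]
    return H


def on_lattice(H, point):
    """Forward substitution: is `point` an integer combination of H's columns?"""
    coeffs = []
    for r, x in enumerate(point):
        rest = x - sum(H[r][c] * coeffs[c] for c in range(r))
        if rest % H[r][r] != 0:
            return False
        coeffs.append(rest // H[r][r])
    return True


# Hypothetical nonunimodular transformation (determinant 2): u = i + j, v = -i + j.
T = [[1, 1], [-1, 1]]
H = column_hnf(T)  # H == [[1, 0], [1, 2]]

# Image of a small rectangular iteration space under T.
points = {(T[0][0] * i + T[0][1] * j, T[1][0] * i + T[1][1] * j)
          for i in range(4) for j in range(3)}

steps = [H[k][k] for k in range(len(H))]
print("HNF:", H, "loop steps:", steps)
print("all transformed iterations on the lattice:",
      all(on_lattice(H, p) for p in points))
```

For this example the outer loop has step 1 and the inner loop has step 2, with the inner index starting at an offset congruent to the outer index modulo 2. A unit-step scan of the transformed space would therefore visit points that correspond to no original iteration. For a unimodular `T` the same routine returns the identity matrix, which matches the statement that all steps are equal to 1.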