Transformations for imperfectly nested loops

Authors:
Induprakas Kodukula;Keshav Pingali
Affiliations:
Department of Computer Science, Cornell University, Ithaca, NY;Department of Computer Science, Cornell University, Ithaca, NY
Venue:
Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
Year:
1996

Citing 9
Cited 14

VLSI array processors

VLSI array processors
A theory of loop permutations

Selected papers of the second workshop on Languages and compilers for parallel computing
Scanning polyhedra with DO loops

PPOPP '91 Proceedings of the third ACM SIGPLAN symposium on Principles and practice of parallel programming
A practical algorithm for exact array dependence analysis

Communications of the ACM
Partitioning the statement per iteration space using non-singular matrices

ICS '93 Proceedings of the 7th international conference on Supercomputing
Some efficient solutions to the affine scheduling problem: I. One-dimensional time

International Journal of Parallel Programming
A singular loop transformation framework based on non-singular matrices

International Journal of Parallel Programming
High Performance Compilers for Parallel Computing

High Performance Compilers for Parallel Computing
Code generation for multiple mappings

FRONTIERS '95 Proceedings of the Fifth Symposium on the Frontiers of Massively Parallel Computation (Frontiers'95)

Sparse code generation for imperfectly nested loops with dependences

ICS '97 Proceedings of the 11th international conference on Supercomputing
New tiling techniques to improve cache temporal locality

Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation
SMARTS: exploiting temporal locality and parallelism through vertical execution

ICS '99 Proceedings of the 13th international conference on Supercomputing
Array dataflow analysis

Compiler optimizations for scalable parallel systems
Static and Dynamic Locality Optimizations Using Integer Linear Programming

IEEE Transactions on Parallel and Distributed Systems
A Layout-Conscious Iteration Space Transformation Technique

IEEE Transactions on Computers
A Compiler Framework for Tiling Imperfectly-Nested Loops

LCPC '99 Proceedings of the 12th International Workshop on Languages and Compilers for Parallel Computing
Compile-time composition of run-time data and iteration reorderings

PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Automatic tiling of iterative stencil loops

ACM Transactions on Programming Languages and Systems (TOPLAS)
Loop Optimization using Hierarchical Compilation and Kernel Decomposition

Proceedings of the International Symposium on Code Generation and Optimization
Iterative compilation with kernel exploration

LCPC'06 Proceedings of the 19th international conference on Languages and compilers for parallel computing
Optimizing data locality using array tiling

Proceedings of the International Conference on Computer-Aided Design
A data layout optimization framework for NUCA-based multicores

Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Reshaping cache misses to improve row-buffer locality in multicore systems

PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques

Quantified Score

Hi-index	0.00

Visualization

Abstract

Loop transformations are critical for compiling high-performance code for modern computers. Existing work has focused on transformations for perfectly nested loops (that is, loops in which all assignment statements are contained within the innermost loop of a loop nest). In practice, most loop nests, such as those in matrix factorization codes, are imperfectly nested. In some programs, imperfectly nested loops can be transformed into perfectly nested loops by loop distribution, but this is not always legal. In this paper, we present an approach to transforming imperfectly nested loops directly. Our approach is an extension of the linear loop transformation framework for perfectly nested loops, and it models permutation, reversal, skewing, scaling, alignment, distribution and jamming. We also give a completion procedure which generates a complete transformation from a partial transformation.