Efficient tiled loop generation: D-tiling

Authors:
DaeGon Kim;Sanjay Rajopadhye
Affiliations:
Colorado State University, Fort Collins, CO, U.S.A.;Colorado State University, Fort Collins, CO, U.S.A.
Venue:
LCPC'09 Proceedings of the 22nd international conference on Languages and Compilers for Parallel Computing
Year:
2009

Citing 21
Cited 1

Supernode partitioning

POPL '88 Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
More iteration space tiling

Proceedings of the 1989 ACM/IEEE conference on Supercomputing
Scanning polyhedra with DO loops

PPOPP '91 Proceedings of the third ACM SIGPLAN symposium on Principles and practice of parallel programming
A data locality optimizing algorithm

PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
A practical algorithm for exact array dependence analysis

Communications of the ACM
Communication optimization and code generation for distributed memory machines

PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
SUIF: an infrastructure for research on parallelizing and optimizing compilers

ACM SIGPLAN Notices
Loop tiling for parallelism

Loop tiling for parallelism
Generation of Efficient Nested Loops from Polyhedra

International Journal of Parallel Programming - Special issue on instruction-level parallelism and parallelizing compilation, part 2
Tiling imperfectly-nested loop nests

Proceedings of the 2000 ACM/IEEE conference on Supercomputing
Blocking and array contraction across arbitrarily nested loops using affine partitioning

PPoPP '01 Proceedings of the eighth ACM SIGPLAN symposium on Principles and practices of parallel programming
Register tiling in nonrectangular iteration spaces

ACM Transactions on Programming Languages and Systems (TOPLAS)
Iteration Space Tiling for Memory Hierarchies

Proceedings of the Third SIAM Conference on Parallel Processing for Scientific Computing
Code generation for multiple mappings

FRONTIERS '95 Proceedings of the Fifth Symposium on the Frontiers of Massively Parallel Computation (Frontiers'95)
SableCC, an Object-Oriented Compiler Framework

TOOLS '98 Proceedings of the Technology of Object-Oriented Languages and Systems
Code Generation in the Polyhedral Model Is Easier Than You Think

Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
Parameterized tiled loops for free

Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
Multi-level tiling: M for the price of one

Proceedings of the 2007 ACM/IEEE conference on Supercomputing
A practical automatic polyhedral parallelizer and locality optimizer

Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation
Parametric multi-level tiling of imperfectly nested loops

Proceedings of the 23rd international conference on Supercomputing
An efficient code generation technique for tiled iteration spaces

IEEE Transactions on Parallel and Distributed Systems

Automatic creation of tile size selection models

Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization

Quantified Score

Hi-index	0.00

Visualization

Abstract

Tiling is an important loop optimization for exposing coarse-grained parallelism and enhancing data locality. Tiled loop generation from an arbitrarily shaped polyhedron is a well studied problem. Except for the special case of a rectangular iteration space, the tiled loop generation problem has been long believed to require heavy machinery such as Fourier-Motzkin elimination and projection, and hence to have an exponential complexity. In this paper we propose a simple and efficient tiled loop generation technique similar to that for a rectangular iteration space. In our technique, each loop bound is adjusted only once, syntactically and independently. Therefore, our algorithm runs linearly with the number of loop bounds. Despite its simplicity, we retain several advantages of recent tiled code generation schemes—unified generation for fixed, parameterized and hybrid tiled loops, scalability for multi-level tiled loop generation with the ability to separate full tiles at any levels, and compact code. We also explore various schemes for multi-level tiled loop generation. We formally prove the correctness of our scheme and experimentally validate that the efficiency of our technique is comparable to existing parameterized tiled loop generation approaches. Our experimental results also show that multi-level tiled loop generation schemes have an impact on performance of generated code. The fact that our scheme can be implemented without sophisticated machinery makes it well suited for autotuners and production compilers.