What can we gain by unfolding loops?

Authors:
Litong Song;Krishna Kavi
Affiliations:
University of North Texas, Denton, Texas;University of North Texas, Denton, Texas
Venue:
ACM SIGPLAN Notices
Year:
2004

Abstract

Loops in programs are the source of many optimizations for improving program performance, particularly on modern high-performance architectures as well as vector and multithreaded systems. Techniques such as loop-invariant code motion, loop unrolling, and loop peeling have demonstrated their utility in compiler optimizations. However, many of these techniques can be used only in very limited cases, when the loops are "well-structured" and easy to analyze. For instance, loop-invariant code motion works only when invariant code is inside loops; loop unrolling and loop peeling work effectively when the array references are either constants or affine functions of the index variable. It is our contention that many opportunities are overlooked by limiting the optimizations to "well-structured" loops. In many cases, even "badly-structured" loops can be transformed into "well-structured" loops. As a case in point, we show how some loop-dependent code can be transformed into loop-independent code by transforming the loops. The technique described in this paper relies on unfolding the loop for several initial iterations so that more opportunities are exposed for existing compiler optimization techniques such as loop-invariant code motion, loop peeling, and loop unrolling.
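
To make the idea of turning loop-dependent code into loop-independent code concrete, here is a minimal sketch. It is not taken from the paper: the functions `original` and `unfolded`, and the variables `a`, `b`, `n`, `x`, and `y`, are hypothetical names chosen for illustration. The sketch assumes a loop in which `x` changes only during the first iteration, so the expression computed from `x` looks loop-dependent to a simple analysis, even though it is constant from the second iteration onward.

```python
def original(a, b, n):
    """Loop whose body depends on the loop state only in its first iteration."""
    out = []
    x = a
    for _ in range(n):
        y = x * x + 1   # reads x, which is assigned inside the loop
        out.append(y)
        x = b           # from the second iteration onward, x is always b
    return out


def unfolded(a, b, n):
    """Same computation with the first iteration unfolded out of the loop."""
    out = []
    if n > 0:
        # Unfolded first iteration: the only one that observes x == a.
        out.append(a * a + 1)
        # In all remaining iterations x == b, so y no longer changes and
        # can be computed once, outside the loop (invariant code motion).
        y = b * b + 1
        for _ in range(n - 1):
            out.append(y)
    return out
```

Both functions return the same list for any `n >= 0`. In the unfolded version, the remaining loop contains no computation that depends on `x`, which is the kind of opportunity that a later pass such as loop-invariant code motion could exploit.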