The development of a unimodular transformation theory and associated algorithms has renewed interest in Fortran DO loops that are not perfectly (or tightly) nested. In this paper we summarize a number of techniques that convert imperfectly nested loops into perfectly nested loops. We examined over 25,000 lines of scientific Fortran kernels and benchmarks. Statistics are reported on how often imperfectly nested loops occur and how effective two transformations (scalar forward substitution and loop distribution) are at converting them into perfectly nested loops. Further, we describe a compiler that integrates scalar forward substitution, loop distribution, and unimodular transformations while maintaining the basic philosophy of unimodular transformation theory. While our data indicate that imperfectly nested loops still present a problem, the compiler we describe is no more limited by perfectly nested loops than other restructuring compilers available today.
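As a rough illustration of the two transformations named above, the sketch below shows a small imperfectly nested loop and one way it might be rewritten as a perfect nest. It is written in Python rather than Fortran purely for brevity. The function names (`imperfect_nest`, `perfect_nest`), the arrays `a`, `b`, `c`, `s`, and the loop body are all hypothetical and are not taken from the paper or its compiler.

```python
def imperfect_nest(a, b):
    """Imperfect nest: two statements sit between the outer and inner loop."""
    n, m = len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    s = [0] * n
    for i in range(n):
        t = 2 * a[i]          # scalar temporary defined outside the inner loop
        s[i] = 0              # array initialization outside the inner loop
        for j in range(m):
            c[i][j] = t + b[i][j]
            s[i] += b[i][j]
    return c, s


def perfect_nest(a, b):
    """Same computation after the two transformations (illustrative only)."""
    n, m = len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    s = [0] * n
    # Loop distribution: the initialization of s moves into its own loop.
    # This is assumed legal here because each s[i] is still initialized
    # before any iteration of the second loop accumulates into it.
    for i in range(n):
        s[i] = 0
    # Scalar forward substitution: the temporary t is replaced by its
    # defining expression, so no statement remains between the two loops.
    for i in range(n):
        for j in range(m):
            c[i][j] = 2 * a[i] + b[i][j]
            s[i] += b[i][j]
    return c, s
```

The second loop in `perfect_nest` is a perfect nest, which is the form that unimodular transformations such as interchange or skewing assume. In a real compiler the legality of each step would have to be established by dependence analysis rather than assumed as it is in this sketch.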