Handling Irreducible Loops: Optimized Node Splitting vs. DJ-Graphs
Euro-Par '01 Proceedings of the 7th International Euro-Par Conference Manchester on Parallel Processing
This paper addresses the question of how to handle irreducible regions during optimization, a question that has become more pressing for contemporary processors because recent VLIW-like architectures rely heavily on instruction scheduling. The contributions of this paper are twofold. First, a method of optimized node splitting that transforms irreducible regions of control flow into reducible regions is formally defined and its correctness is shown. This method is superior to previously published approaches because it replicates fewer nodes. Second, three methods for handling regions of irreducible control flow are evaluated with respect to their impact on compiler optimizations: (1) traditional node splitting, (2) optimized node splitting, and (3) DJ-Graphs, which are used to recognize the nesting of irreducible (and reducible) loops and to apply common loop optimizations extended for irreducible loops. Experiments compare the performance of these approaches against a baseline in which irreducible loops go unrecognized and therefore cannot be subject to loop optimizations, as is typical for contemporary compilers. Measurements show improvements of 1 to 40% for these methods of handling irreducible loops over the unoptimized case. Optimized node splitting may be chosen to retrofit existing compilers, since it requires only a few changes to an optimizing compiler while limiting the code growth of compiled programs compared to traditional node splitting. Recognizing loops via DJ-Graphs should be chosen for new compiler developments, since it requires more changes to the optimizer but does not significantly change the code size of compiled programs while yielding comparable improvements.
Handling irreducible loops should yield even greater benefits when exploiting the instruction-level parallelism of modern architectures, in the context of global instruction scheduling and of optimization techniques that may introduce irreducible loops, such as enhanced modulo scheduling.
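
To make the terms concrete, the following is a minimal Python sketch of what "irreducible" means and of how traditional node splitting removes irreducibility. It is not the optimized node splitting method defined in the paper, and it does not use DJ-Graphs. The function names `is_reducible` and `split_node`, the edge-list representation, and the four-node example graph are assumptions made for illustration. The reducibility test uses the classic T1/T2 interval reductions (remove self-loops, merge a node into its unique predecessor) and assumes every node is reachable from the entry.

```python
from collections import defaultdict


def is_reducible(edges, entry):
    """Return True if the flow graph collapses to one node under T1/T2."""
    succ = defaultdict(set)
    pred = defaultdict(set)
    nodes = {entry}
    for a, b in edges:
        nodes.update((a, b))
        if a != b:  # T1: self-loops are simply dropped
            succ[a].add(b)
            pred[b].add(a)
    changed = True
    while changed and len(nodes) > 1:
        changed = False
        for n in list(nodes):
            if n != entry and len(pred[n]) == 1:
                # T2: merge n into its unique predecessor p
                (p,) = pred[n]
                succ[p].discard(n)
                for s in succ[n]:
                    pred[s].discard(n)
                    if s != p:  # an edge back to p would be a self-loop (T1)
                        succ[p].add(s)
                        pred[s].add(p)
                nodes.remove(n)
                del succ[n], pred[n]
                changed = True
    return len(nodes) == 1


def split_node(edges, node):
    """Traditional node splitting: one copy of `node` per predecessor.

    Assumes `node` has no self-loop. Copies are named hypothetically
    as "<node>'<i>"; the first predecessor keeps the original node.
    """
    preds = sorted(a for a, b in edges if b == node and a != node)
    succs = [b for a, b in edges if a == node]
    out = [(a, b) for a, b in edges if a != node and b != node]
    for i, p in enumerate(preds):
        copy = node if i == 0 else f"{node}'{i}"
        out.append((p, copy))
        out.extend((copy, s) for s in succs)
    return out


# Smallest irreducible region: a loop {a, b} that can be entered at a or at b.
cfg = [("entry", "a"), ("entry", "b"), ("a", "b"), ("b", "a")]
split_cfg = split_node(cfg, "b")
```

In this sketch, `is_reducible(cfg, "entry")` is false because the loop has two entry points, while `is_reducible(split_cfg, "entry")` is true at the cost of one replicated node. That replication is the code growth the abstract refers to, which optimized node splitting aims to limit.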