Rotating Register Allocation for Enhanced Pipeline Scheduling
PACT '07 Proceedings of the 16th International Conference on Parallel Architecture and Compilation Techniques
A rotating register file is an architectural support for software pipelining in which many registers are renamed at once when a rotating branch is executed. It has primarily been used to overcome cross-iteration register overwrites in modulo-scheduled, straight-line or if-converted loops. Recently, a technique was proposed that uses rotating registers for loops with arbitrary control flow, scheduled by enhanced pipeline scheduling (EPS). EPS generates many hard-to-delete copies to overcome cross-iteration register overwrites, and these copies may cause stalls in addition to consuming resources. The earlier technique eliminates such copies by allocating rotating registers, avoiding the serious slowdown they cause. Unfortunately, it could not eliminate as many copies as the unroll-based copy elimination technique, although both techniques employ the same abstraction, called an extended live range (ELR). The reason is that only a branch edge can be a rotating branch, while any edge can be an unrolling edge. In this paper, we propose an enhanced rotating register allocation technique that can use more than one rotating branch in order to eliminate more copies. This requires extending both the theory of ELR and the rotating register allocation algorithm. Our experimental results indicate that the proposed technique eliminates 20% more copies than the previous technique, which improves performance by more than 10% in some cases.
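As a rough illustration of the renaming behavior described above, the following Python sketch models a rotating register file with a single rotating register base. It is not the allocation algorithm of this paper. The class name, method names, register count, and the choice to decrement the base on each rotating branch (in the style of IA-64) are all assumptions made for the example.

```python
class RotatingRegisterFile:
    """Toy model of a rotating register file (illustrative names only)."""

    def __init__(self, size):
        self.size = size
        self.physical = [None] * size
        self.rrb = 0  # rotating register base (assumed IA-64-style)

    def _phys(self, logical):
        # Map a logical register number to a physical slot.
        return (logical + self.rrb) % self.size

    def write(self, logical, value):
        self.physical[self._phys(logical)] = value

    def read(self, logical):
        return self.physical[self._phys(logical)]

    def rotating_branch(self):
        # Renames every register at once: a value held in logical
        # register r becomes visible as logical register r + 1.
        self.rrb = (self.rrb - 1) % self.size


def run_loop(iterations):
    """Each iteration writes logical r0 and reads the previous
    iteration's value from logical r1, with no copy instruction."""
    rrf = RotatingRegisterFile(4)
    seen = []
    for i in range(iterations):
        rrf.write(0, i * 10)
        if i > 0:
            seen.append(rrf.read(1))
        rrf.rotating_branch()
    return seen
```

In this model, the value defined in one iteration survives the next iteration's write to the same logical register, which is the cross-iteration overwrite that would otherwise require an explicit copy.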