Unroll-based register coalescing

Authors:
Suhyun Kim; Soo-Mook Moon; Jinpyo Park; Kemal Ebcioğlu
Affiliations:
School of Electrical Engineering, Seoul National University, Seoul 151-742, Korea (Kim, Moon, Park); IBM T. J. Watson Research Center, P.O. Box 218, Yorktown Heights, NY (Ebcioğlu)
Venue:
Proceedings of the 14th international conference on Supercomputing
Year:
2000

Abstract

Aggressive instruction scheduling leaves behind many renaming copy instructions that cannot be coalesced because of interferences. These copies consume resources and, more seriously, may cause stalls when they are generated to rename multi-latency instructions. This paper proposes a code transformation technique based on loop unrolling that makes those copies coalescible. Two unique features of the technique are its method of determining the precise unroll amount, based on the idea of an extended live range, and its insertion of special bookkeeping copies at loop exits. The technique also provides a simpler and more general solution to the cross-iteration register overwrite problem in software pipelining, one that works for loops with control flow as well as for straight-line loops. In addition, it is applicable to other optimizations, including path length reduction and redundant subscripted reference elimination.

Our empirical study, performed on a 16-ALU VLIW testbed with a two-cycle load latency, shows that 86% of the otherwise uncoalescible copies in innermost loops become coalescible when the loops are unrolled 2.2 times on average. It also demonstrates that the unroll amount obtained is precise and the most efficient. The unrolled version of the VLIW code includes fewer no-op VLIWs caused by stalls, improving performance by a geometric mean of 18%.
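
The general idea of removing a loop-carried copy by unrolling can be illustrated with a small sketch. This is not the paper's algorithm: the function names, the example loop, and the hand-picked unroll factor of 2 are all assumptions made for illustration, and Python variables only stand in for registers. In the rolled loop, `prev` and `cur` are both live at the addition, so the copy `prev = cur` cannot simply be coalesced away. In the unrolled loop, the two names `a` and `b` alternate roles, so no copy is needed inside the loop body; a single bookkeeping copy appears only on the exit path where the carried value ends up in the "wrong" name.

```python
def pairwise_sums_with_copy(xs):
    """Rolled loop: every iteration ends with a copy (prev = cur)."""
    out = []
    prev = 0
    for i in range(len(xs)):
        cur = xs[i]              # stands in for a load
        out.append(prev + cur)   # prev and cur are both live here
        prev = cur               # copy that carries the value to the next iteration
    return out, prev


def pairwise_sums_unrolled(xs):
    """Loop unrolled by 2 (hand-picked factor): a and b swap roles, no copy in the body."""
    out = []
    a = 0
    n = len(xs)
    i = 0
    while i + 1 < n:
        b = xs[i]
        out.append(a + b)        # first copy of the body: a is "prev", b is "cur"
        a = xs[i + 1]
        out.append(b + a)        # second copy of the body: roles are swapped
        i += 2
    if i < n:
        # Exit after an odd number of iterations: the carried value is in b,
        # but code after the loop expects it in a.
        b = xs[i]
        out.append(a + b)
        a = b                    # bookkeeping copy on the loop exit only
    return out, a
```

Both functions return the same list and the same final carried value, so code following the loop can use either version interchangeably. How the unroll amount is actually determined in the paper (via extended live ranges) is not shown here.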