Enhanced pipeline scheduling (EPS) is a software pipelining technique that achieves a variable initiation interval (II) for loops with control flow by pipelining through code motion. EPS, however, leaves behind many renaming copy instructions that cannot be coalesced because of interference. These copies consume resources and, more seriously, may cause a stall when they rename a multi-latency instruction whose latency is longer than the II that EPS aims for. This paper describes how those renaming copies can be deleted through unrolling, which lets EPS avoid a serious slowdown from latency handling and resource pressure while keeping its variable II and other advantages. In fact, EPS's renaming through copies, followed by unroll-based copy elimination, provides a simpler and more general solution to the cross-iteration register overwrite problem in software pipelining, one that works for loops with control flow as well as for straight-line loops. Our empirical study on a VLIW testbed with a two-cycle load latency shows that the unrolled version of the 16-ALU VLIW code includes fewer no-op VLIWs caused by stalls, improving performance by a geometric mean of 18%; with a longer latency, the peak improvement reaches a geometric mean of 25%.
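The cross-iteration register overwrite problem and its unroll-based fix can be illustrated with a toy example. The sketch below is not the paper's algorithm or its VLIW code; it is a hand-written Python simulation under assumed conditions: a hypothetical loop that doubles each array element, pipelined so that a loaded value is consumed two kernel iterations after its load. The function names and the registers `r`, `t`, `r0`, and `r1` are illustrative only. In the first version a renaming copy keeps the older value alive while the next load overwrites `r`; in the second, the kernel is unrolled twice and the two copies of the body alternate between `r0` and `r1`, so no copy is needed.

```python
def pipelined_with_copy(a):
    """Toy pipelined kernel: each value is used two kernel iterations after its load."""
    n = len(a)
    out = []
    r = t = None
    for k in range(n + 2):          # n loads plus two drain iterations
        if k >= 2:
            out.append(t * 2)       # use the value loaded at kernel iteration k-2
        t = r                       # renaming copy: save the value loaded at k-1
        if k < n:
            r = a[k]                # load for iteration k overwrites r
    return out


def unrolled_without_copy(a):
    """Same schedule unrolled twice; alternating r0/r1 replaces the copy."""
    n = len(a)
    out = []
    r0 = r1 = None
    k = 0
    while k < n + 2:
        # First copy of the kernel body (even k): reads and writes r0.
        if k >= 2:
            out.append(r0 * 2)      # value loaded at k-2 is still in r0
        if k < n:
            r0 = a[k]
        k += 1
        if k >= n + 2:
            break
        # Second copy of the kernel body (odd k): reads and writes r1.
        if k >= 2:
            out.append(r1 * 2)      # value loaded at k-2 is still in r1
        if k < n:
            r1 = a[k]
        k += 1
    return out


data = [3, 1, 4, 1, 5]
print(pipelined_with_copy(data))
print(unrolled_without_copy(data))
```

Both functions produce the same doubled sequence; the unrolled one does so with the same two registers but without executing a copy in every kernel iteration, which is the resource and latency saving the abstract refers to.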