Overlapped loop support in the Cydra 5
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Register allocation for software pipelined loops
PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
Lifetime-sensitive modulo scheduling
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
A novel framework of register allocation for software pipelining
POPL '93 Proceedings of the 20th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Decomposed software pipelining with reduced register requirement
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
Minimizing register requirements of a modulo schedule via optimum stage scheduling
International Journal of Parallel Programming
Schedule-independent storage mapping for loops
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
On a graph-theoretical model for cyclic register allocation
Discrete Applied Mathematics
Optimal acyclic fine-grain scheduling with cache effects for embedded and real time systems
Proceedings of the ninth international symposium on Hardware/software codesign
A unified framework for schedule and storage optimization
Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
RESIS: A New Methodology for Register Optimization in Software Pipelining
Euro-Par '96 Proceedings of the Second International Euro-Par Conference on Parallel Processing-Volume II
URSA: A Unified ReSource Allocator for Registers and Functional Units in VLIW Architectures
PACT '93 Proceedings of the IFIP WG10.3. Working Conference on Architectures and Compilation Techniques for Fine and Medium Grain Parallelism
A Register Allocation Framework Based on Hierarchical Cyclic Interval Graphs
CC '92 Proceedings of the 4th International Conference on Compiler Construction
Reducing DRAM Latencies with an Integrated Memory Hierarchy Design
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Code-size conscious pipelining of imperfectly nested loops
MEDEA '07 Proceedings of the 2007 workshop on MEmory performance: DEaling with Applications, systems and architecture
Software Pipelining in Nested Loops with Prolog-Epilog Merging
HiPEAC '09 Proceedings of the 4th International Conference on High Performance Embedded Architectures and Compilers
Hi-index | 0.00 |
The register allocation in loops is generally performed after or during the software pipelining process. This is because doing a conventional register allocation at first step without assuming a schedule lacks the information of interferences between variable lifetime intervals. Thus, the register allocator may introduce an excessive amount of false dependences that reduce dramatically the ILP (Instruction Level Parallelism). We present a new framework for controlling the register pressure before software pipelining. This is based on inserting some anti-dependences edges (register reuse edges) labeled with reuse distances, directly on the data dependence graph. In this new graph, we are able to guarantee that the number of simultaneously alive variables in any schedule does not exceed a limit. The determination of register and distance reuse is parameterized by the desired critical circuit ratio (MII) as well as by the register pressure constraints - either can be minimized while the other one is fixed. After scheduling, register allocation is done cyclically on conventional register sets or on rotating register files. We give an optimal exact model, and another approximative one that generalizes the Ning-Gao [13] buffer optimization heuristics.