Code scheduling and register allocation in large basic blocks
ICS '88 Proceedings of the 2nd international conference on Supercomputing
Spill code minimization techniques for optimizing compliers
PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
Scheduling arithmetic and load operations in parallel with no spilling
SIAM Journal on Computing
Register allocation via hierarchical graph coloring
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Register allocation with instruction scheduling
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
A schedular-sensitive global register allocator
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
Minimizing register requirements under resource-constrained rate-optimal software pipelining
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
CRAIG: a practical framework for combining instruction scheduling and register assignment
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
Allocating registers in multiple instruction-issuing processors
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
Optimal software pipelining with function unit and register constraints
Optimal software pipelining with function unit and register constraints
Spill code minimization via interference region spilling
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
On a graph-theoretical model for cyclic register allocation
Discrete Applied Mathematics
Linear scan register allocation
ACM Transactions on Programming Languages and Systems (TOPLAS)
Modulo scheduling with integrated register spilling for clustered VLIW architectures
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Dual-Issue Scheduling for Binary Trees with Spills and Pipelined Loads
SIAM Journal on Computing
IEEE Transactions on Computers
Dependence-Conscious Global Register Allocation
Proceedings of the International Conference on Programming Languages and System Architectures
URSA: A Unified ReSource Allocator for Registers and Functional Units in VLIW Architectures
PACT '93 Proceedings of the IFIP WG10.3. Working Conference on Architectures and Compilation Techniques for Fine and Medium Grain Parallelism
Register allocation & spilling via graph coloring
SIGPLAN '82 Proceedings of the 1982 SIGPLAN symposium on Compiler construction
A Register Pressure Sensitive Instruction Scheduler for Dynamic Issue Processors
PACT '97 Proceedings of the 1997 International Conference on Parallel Architectures and Compilation Techniques
Reducing DRAM Latencies with an Integrated Memory Hierarchy Design
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Unification of register allocation and instruction scheduling in compilers for fine-grain parallel architectures
Improving Load/Store Queues Usage in Scientific Computing
ICPP '04 Proceedings of the 2004 International Conference on Parallel Processing
Concurrency and Computation: Practice & Experience - 10th International Workshop on Compilers for Parallel Computers (CPC 2003)
Using SIMD registers and instructions to enable instruction-level parallelism in sorting algorithms
Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures
Tetris: a new register pressure control technique for VLIW processors
Proceedings of the 2007 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
On Periodic Register Need in Software Pipelining
IEEE Transactions on Computers
Periodic register saturation in innermost loops
Parallel Computing
Tetris-XL: A performance-driven spill reduction technique for embedded VLIW processors
ACM Transactions on Architecture and Code Optimization (TACO)
ACM Transactions on Architecture and Code Optimization (TACO)
Hi-index | 0.00 |
The registers constraints are usually taken into account during the scheduling pass of an acyclic data dependence graph (DAG): any schedule of the instructions inside a basic block must bound the register requirement under a certain limit. In this work, we show how to handle the register pressure before the instruction scheduling of a DAG. We mathematically study an approach which consists in managing the exact upper-bound of the register need for all the valid schedules of a considered DAG, independently of the functional unit constraints. We call this computed limit the register saturation (RS) of the DAG. Its aim is to detect possible obsolete register constraints, i.e., when RS does not exceed the number of available registers. If it does, we add some serial edges to the original DAG such that the worst register need does not exceed the number of available registers. We propose an appropriate mathematical formalism for this problem. Our generic processor model takes into account superscalar, VLIW and EPIC/IA64 architectures. Our deeper analysis of the problem and our formal methods enable us to provide nearly optimal heuristics and strategies for register optimization in the face of ILP.