Compilers: principles, techniques, and tools
Compilers: principles, techniques, and tools
Efficient instruction scheduling for a pipelined architecture
SIGPLAN '86 Proceedings of the 1986 SIGPLAN symposium on Compiler construction
Code scheduling and register allocation in large basic blocks
ICS '88 Proceedings of the 2nd international conference on Supercomputing
Instruction scheduling for the IBM RISC System/6000 processor
IBM Journal of Research and Development
The priority-based coloring approach to register allocation
ACM Transactions on Programming Languages and Systems (TOPLAS)
Integrating register allocation and instruction scheduling for RISCs
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Linear-time, optimal code scheduling for delayed-load architectures
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
Register allocation with instruction scheduling
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
The superblock: an effective technique for VLIW and superscalar compilation
The Journal of Supercomputing - Special issue on instruction-level parallelism
Improvements to graph coloring register allocation
ACM Transactions on Programming Languages and Systems (TOPLAS)
Register allocation over the program dependence graph
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Spill-free parallel scheduling of basic blocks
Proceedings of the 28th annual international symposium on Microarchitecture
POPL '96 Proceedings of the 23rd ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Optimal software pipelining with function unit and register constraints
Optimal software pipelining with function unit and register constraints
Optimal and near-optimal global register allocations using 0–1 integer programming
Software—Practice & Experience
Quality and speed in linear-scan register allocation
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Advanced compiler design and implementation
Advanced compiler design and implementation
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
The Generation of Optimal Code for Arithmetic Expressions
Journal of the ACM (JACM)
Code Generation for a One-Register Machine
Journal of the ACM (JACM)
Linear scan register allocation
ACM Transactions on Programming Languages and Systems (TOPLAS)
Optimal spilling for CISC machines with few registers
Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
Advanced Topics in Dataflow Computing and Multithreading
Advanced Topics in Dataflow Computing and Multithreading
The Design of an Optimizing Compiler
The Design of an Optimizing Compiler
Minimum Register Instruction Sequence Problem: Revisiting Optimal Code Generation for DAGs
IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
Scheduling Expression DAGs for Minimal Register Need
PLILP '96 Proceedings of the 8th International Symposium on Programming Languages: Implementations, Logics, and Programs
Register allocation & spilling via graph coloring
SIGPLAN '82 Proceedings of the 1982 SIGPLAN symposium on Compiler construction
A Register Pressure Sensitive Instruction Scheduler for Dynamic Issue Processors
PACT '97 Proceedings of the 1997 International Conference on Parallel Architectures and Compilation Techniques
A New Framework for Integrated Global Local Scheduling
PACT '98 Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques
PACT '99 Proceedings of the 1999 International Conference on Parallel Architectures and Compilation Techniques
Combining Register Allocation and Instruction Scheduling
Combining Register Allocation and Instruction Scheduling
Data-Dependency Graph Transformations for Instruction Scheduling
Journal of Scheduling
Register saturation in instruction level parallelism
International Journal of Parallel Programming
Prematerialization: reducing register pressure for free
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Tetris: a new register pressure control technique for VLIW processors
Proceedings of the 2007 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Periodic register saturation in innermost loops
Parallel Computing
Tetris-XL: A performance-driven spill reduction technique for embedded VLIW processors
ACM Transactions on Architecture and Code Optimization (TACO)
Fine-grain stacked register allocation for the itanium architecture
LCPC'02 Proceedings of the 15th international conference on Languages and Compilers for Parallel Computing
Performance characterization of the 64-bit x86 architecture from compiler optimizations' perspective
CC'06 Proceedings of the 15th international conference on Compiler Construction
Optimal and heuristic global code motion for minimal spilling
CC'13 Proceedings of the 22nd international conference on Compiler Construction
ACM Transactions on Architecture and Code Optimization (TACO)
International Journal of Parallel Programming
Hi-index | 14.98 |
In this paper, we address the problem of generating an optimal instruction sequence S for a Directed Acyclic Graph (DAG), where S is optimal in terms of the number of registers used. We call this the Minimum Register Instruction Sequence (MRIS) problem. The motivation for revisiting the MRIS problem stems from several modern architecture innovations/requirements that has put the instruction sequencing problem in a new context. We develop an efficient heuristic solution for the MRIS problem. This solution is based on the notion of instruction lineage驴a set of instructions that can definitely share a single register. The formation of lineages exploits the structure of the dependence graph to facilitate the sharing of registers not only among instructions within a lineage, but also across lineages. Our efficient heuristics to 驴fuse驴 lineages further reduce the register requirement. This reduced register requirement results in generating a code sequence with fewer register spills. We have implemented our solution in the MIPSpro production compiler and measured its performance on the SPEC95 floating point benchmark suite. Our experimental results demonstrate that the proposed instruction sequencing method significantly reduces the number of spill loads and stores inserted in the code, by more than 50 percent in each of the benchmarks. Our approach reduces the average number of dynamic loads and stores executed by 10.4 percent and 3.7 percent, respectively. Further, our approach improves the execution time of the benchmarks on an average by 3.2 percent. In order to evaluate how efficiently our heuristics find a near-optimal solution to the MRIS problem, we develop an elegant integer linear programming formulation for the MRIS problem. Using a commercial integer linear programming solver, we obtain the optimal solution for the MRIS problem. Comparing the optimal solution from the integer linear programming tool with our heuristic solution reveals that, in a very large majority (99.2 percent) of the cases, our heuristic solution is optimal. For this experiment, we used a set of 675 dependence graphs representing basic blocks extracted from scientific benchmark programs.