Improved spill code generation for software pipelined loops

Authors:
Javier Zalamea;Josep Llosa;Eduard Ayguadé;Mateo Valero
Affiliations:
Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, cr. Jordi Girona 1-3, Mòdul D6, Campus Nord, 08034, Barcelona, SPAIN;Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, cr. Jordi Girona 1-3, Mòdul D6, Campus Nord, 08034, Barcelona, SPAIN;Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, cr. Jordi Girona 1-3, Mòdul D6, Campus Nord, 08034, Barcelona, SPAIN;Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, cr. Jordi Girona 1-3, Mòdul D6, Campus Nord, 08034, Barcelona, SPAIN
Venue:
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Year:
2000

Citing 22
Cited 12

Overlapped loop support in the Cydra 5

ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Spill code minimization techniques for optimizing compliers

PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
Coloring heuristics for register allocation

PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
Register allocation via hierarchical graph coloring

PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Circular scheduling: a new technique to perform software pipelining

PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Register allocation for software pipelined loops

PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
Register requirements of pipelined processors

ICS '92 Proceedings of the 6th international conference on Supercomputing
Lifetime-sensitive modulo scheduling

PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Compiling for the Cydra 5

The Journal of Supercomputing - Special issue on instruction-level parallelism
Improvements to graph coloring register allocation

ACM Transactions on Programming Languages and Systems (TOPLAS)
Iterative modulo scheduling: an algorithm for software pipelining loops

MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
Software pipelining with register allocation and spilling

MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
Software pipelining

ACM Computing Surveys (CSUR)
Stage scheduling: a technique to reduce the register requirements of a modulo schedule

Proceedings of the 28th annual international symposium on Microarchitecture
Hypernode reduction modulo scheduling

Proceedings of the 28th annual international symposium on Microarchitecture
Software pipelining showdown: optimal vs. heuristic methods in a production compiler

PLDI '96 Proceedings of the ACM SIGPLAN 1996 conference on Programming language design and implementation
Heuristics for register-constrained software pipelining

Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Quantitative Evaluation of Register Pressure on Software Pipelined Loops

International Journal of Parallel Programming
A Systolic Array Optimizing Compiler

A Systolic Array Optimizing Compiler
Conversion of control dependence to data dependence

POPL '83 Proceedings of the 10th ACM SIGACT-SIGPLAN symposium on Principles of programming languages
Some scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing

MICRO 14 Proceedings of the 14th annual workshop on Microprogramming
Register allocation & spilling via graph coloring

SIGPLAN '82 Proceedings of the 1982 SIGPLAN symposium on Compiler construction

Two-level hierarchical register file organization for VLIW processors

Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
A comparative study of modulo scheduling techniques

ICS '02 Proceedings of the 16th international conference on Supercomputing
Modulo scheduling with integrated register spilling for clustered VLIW architectures

Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Exploiting Pseudo-Schedules to Guide Data Dependence Graph Partitioning

Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Dynamic metrics for java

OOPSLA '03 Proceedings of the 18th annual ACM SIGPLAN conference on Object-oriented programing, systems, languages, and applications
Register Constrained Modulo Scheduling

IEEE Transactions on Parallel and Distributed Systems
Differential register allocation

Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Demystifying on-the-fly spill code

Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Software and hardware techniques to optimize register file utilization in VLIW architectures

International Journal of Parallel Programming
Allocating architected registers through differential encoding

ACM Transactions on Programming Languages and Systems (TOPLAS)
Register allocation and optimal spill code scheduling in software pipelined loops using 0-1 integer linear programming formulation

CC'07 Proceedings of the 16th international conference on Compiler construction
MIRS: modulo scheduling with integrated register spilling

LCPC'01 Proceedings of the 14th international conference on Languages and compilers for parallel computing

Quantified Score

Hi-index	0.00

Visualization

Abstract

Software pipelining is a loop scheduling technique that extractsparallelism out of loops by overlapping the execution of severalconsecutive iterations. Due to the overlapping of iterations, schedules impose high register requirements during their execution. A schedule is valid if it requires at most the number of registers available in the target architecture. If not, its register requirementshave to be reduced either by decreasing the iteration overlapping or by spilling registers to memory. In this paper we describe a set of heuristics to increase the quality of register-constrained modulo schedules. The heuristics decide between the two previous alternatives and define criteria for effectively selecting spilling candidates. The heuristics proposed for reducing the register pressure can be applied to any software pipelining technique. The proposals are evaluated using a register-conscious software pipeliner on a workbench composed of a large set of loops from the Perfect Club benchmark and a set of processor configurations. Proposals in this paper are compared against a previous proposal already described in the literature. For one of these processor configurations and the set of loops that do not fit in the available registers (32), a speed-up of 1.68 and a reduction of the memory traffic by a factor of 0.57 are achieved with an affordable increase in compilation time. For all the loops, this represents a speed-up of 1.38 and a reduction of the memory traffic by a factor of 0.7.