Heuristics for register-constrained software pipelining

Authors:
Josep Llosa; Mateo Valero; Eduard Ayguadé
Affiliations:
Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, Campus Nord, Mòdul D6, Gran Capità s/n, 08071, Barcelona, SPAIN (all three authors)
Venue:
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Year:
1996


Abstract

Software pipelining is a loop scheduling technique that extracts parallelism from loops by overlapping the execution of several consecutive iterations. Considerable effort has gone into producing throughput-optimal schedules under resource constraints and, more recently, throughput-optimal schedules with minimum register requirements. Unfortunately, even a throughput-optimal schedule with minimum register requirements is useless if it requires more registers than the target machine provides. This paper evaluates two techniques for producing register-constrained modulo schedules: increasing the initiation interval (II) and adding spill code. We show that, in general, increasing the II performs poorly and might not converge for some loops. The paper also presents an iterative spilling mechanism that can be applied to any software pipelining technique and proposes several heuristics to speed up the scheduling process.
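
The trade-off described in the abstract can be illustrated with a small toy model. The sketch below is not the paper's algorithm: it assumes each loop value is given as a hypothetical (start, end) lifetime in the flat schedule of one iteration, it holds that schedule fixed while the II changes (a real modulo scheduler would reschedule), and it models a spill crudely by splitting a long lifetime into a one-cycle piece before the store and a one-cycle piece after the reload, ignoring memory ports and latencies. All function names and the sample lifetimes are illustrative assumptions.

```python
def max_live(lifetimes, ii):
    """Peak number of values live in any kernel row for initiation interval ii.

    A value live during flat-schedule cycle c occupies kernel row c % ii,
    so overlapping iterations pile up on the same row.
    """
    rows = [0] * ii
    for start, end in lifetimes:
        for cycle in range(start, end):  # live in the half-open range [start, end)
            rows[cycle % ii] += 1
    return max(rows)


def smallest_ii_that_fits(lifetimes, registers, ii_min, ii_max):
    """Option 1: raise the II until the register requirement fits, or give up."""
    for ii in range(ii_min, ii_max + 1):
        if max_live(lifetimes, ii) <= registers:
            return ii
    return None  # did not converge within the allowed II range


def spill_longest(lifetimes):
    """Toy spill: split the longest lifetime into two one-cycle pieces."""
    longest = max(lifetimes, key=lambda lt: lt[1] - lt[0])
    start, end = longest
    if end - start <= 2:
        return None  # nothing left that a spill could shorten
    rest = list(lifetimes)
    rest.remove(longest)
    return rest + [(start, start + 1), (end - 1, end)]


def iterative_spill(lifetimes, registers, ii):
    """Option 2: keep the II and spill until the requirement fits."""
    spills = 0
    while max_live(lifetimes, ii) > registers:
        lifetimes = spill_longest(lifetimes)
        if lifetimes is None:
            return None
        spills += 1
    return lifetimes, spills


# Hypothetical lifetimes for four loop values (assumed data, not from the paper).
lifetimes = [(0, 6), (1, 5), (2, 8), (3, 4)]

if __name__ == "__main__":
    for ii in (1, 2, 4, 8):
        print("II =", ii, "MaxLive =", max_live(lifetimes, ii))
    print("fits 5 registers at II =", smallest_ii_that_fits(lifetimes, 5, 1, 16))
    print("fits 3 registers at II =", smallest_ii_that_fits(lifetimes, 3, 1, 64))
    print("spilling at II = 8 for 3 registers:", iterative_spill(lifetimes, 3, 8))
```

In this toy data four values are live at the same cycle of a single iteration, so under the fixed-schedule assumption no II brings the requirement below four registers, whereas one spill does. This mirrors, in a very simplified way, the abstract's observation that increasing the II might not converge for some loops.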