Register pressure aware scheduling for high level synthesis

Authors:
Rami Beidas; Wai Sum Mong; Jianwen Zhu
Affiliations:
University of Toronto, ON, Canada; University of Toronto, ON, Canada; University of Toronto, ON, Canada
Venue:
Proceedings of the 16th Asia and South Pacific Design Automation Conference
Year:
2011

Abstract

Variations of list scheduling have become the de facto standard for scheduling straight-line code in software compilers, a trend faithfully inherited by high-level synthesis solutions. By its nature, list scheduling is oblivious to the tightly coupled problem of register pressure, a long-standing fundamental problem that the compiler community has attacked for decades and which, in the case of high-level synthesis, results in excessive instantiation of registers and the accompanying steering logic. To alleviate this problem, we propose a synthesis framework called soft scheduling, which acts as a resource-unconstrained prescheduling stage that restricts subsequent scheduling so as to minimize register pressure. This optimization objective is formulated as a live range minimization problem, a measure shown to be proportional to register pressure, and is solved optimally in polynomial time using a minimum-cost network flow formulation. Unlike past solutions in the compiler community, which try to reduce register pressure by local serialization of subject instructions, the proposed solution operates on the entire basic block or hyperblock and systematically handles instruction chaining subject to the same objective. Applying the proposed solution to a set of real-life benchmarks yields an average register pressure reduction of between 11% and 41%, depending on the compilation and synthesis configurations, with a minor 2% to 4% increase in schedule latency.
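
To make the two quantities in the abstract concrete, the following minimal Python sketch computes the total live range and the peak register pressure of a given schedule for a small dataflow graph. It is an illustration only and does not reproduce the paper's minimum-cost network flow formulation. The graph, the two schedules, and all function names (`live_ranges`, `total_live_range`, `max_register_pressure`) are hypothetical, and the sketch assumes one value per operation, unit-latency operations, and a value that stays live from the step of its producer to the step of its last consumer.

```python
def live_ranges(edges, step):
    """Map each producing op to (definition step, last-use step)."""
    last_use = {}
    for producer, consumer in edges:
        previous = last_use.get(producer, step[producer])
        last_use[producer] = max(previous, step[consumer])
    return {op: (step[op], end) for op, end in last_use.items()}


def total_live_range(edges, step):
    """Sum of live-range lengths, measured in control steps."""
    return sum(end - start for start, end in live_ranges(edges, step).values())


def max_register_pressure(edges, step):
    """Largest number of values live across any control-step boundary."""
    ranges = list(live_ranges(edges, step).values())
    boundaries = range(min(step.values()), max(step.values()))
    return max(
        (sum(1 for start, end in ranges if start <= t < end) for t in boundaries),
        default=0,
    )


# Hypothetical basic block: d = a + b; e = d + c (a, b, c are independent).
EDGES = [("a", "d"), ("b", "d"), ("d", "e"), ("c", "e")]

# As-soon-as-possible schedule: c is produced early and waits two steps.
ASAP = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 2}

# Same latency, but c is delayed to the step just before its only use.
DELAYED = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 2}

for name, schedule in (("ASAP", ASAP), ("DELAYED", DELAYED)):
    print(name, total_live_range(EDGES, schedule),
          max_register_pressure(EDGES, schedule))
```

In this toy example, delaying `c` shortens its live range without lengthening the schedule, which lowers both the total live range and the peak number of simultaneously live values. This is the kind of effect that a live range minimization objective is meant to capture.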