Avoidance and suppression of compensation code in a trace scheduling compiler

Authors:
Stefan M. Freudenberger;Thomas R. Gross;P. Geoffrey Lowney
Affiliations:
HP Laboratories;Carnegie Mellon University;Digital Equipment Corporation
Venue:
ACM Transactions on Programming Languages and Systems (TOPLAS)
Year:
1994

Abstract

Trace scheduling is an optimization technique that selects a sequence of basic blocks as a trace and schedules the operations from the trace together. If an operation is moved across basic block boundaries, one or more compensation copies may be required in the off-trace code. This article discusses the generation of compensation code in a trace scheduling compiler and presents two techniques for limiting the amount of compensation code: avoidance (restricting code motion so that no compensation code is required) and suppression (analyzing the global flow of the program to detect when a copy is redundant). We evaluate the effectiveness of these techniques based on measurements for the SPEC89 suite and the Livermore Fortran Kernels, using our implementation of trace scheduling for a Multiflow Trace 7/300. The article compares different compiler models, contrasting the performance of trace scheduling with the performance obtained from typical RISC compilation techniques.

There are two key results of this study. First, the amount of compensation code generated is not large. For the SPEC89 suite, the average code size increase due to trace scheduling is 6%. Avoidance is more important than suppression, although there are some kernels that benefit significantly from compensation code suppression. Since compensation code is not a major issue, a compiler can be more aggressive in code motion and loop unrolling. Second, compensation code is not critical to obtaining the benefits of trace scheduling. Our implementation of trace scheduling improves the SPECmark rating by 30% over basic block scheduling, while trace scheduling restricted so that no compensation code is required still improves the rating by 25%. This indicates that most basic block scheduling techniques can be extended to trace scheduling without requiring any complicated compensation code bookkeeping.
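To make the notion of compensation copies concrete, the following is a minimal sketch in Python, not the compiler described in the article. It assumes a deliberately simplified model: a trace is a flat list of operations, a "split" marks a side exit, a "join" marks a side entrance, and a copy is needed when an operation ends up below a split it originally preceded or above a join it originally followed. The names `Op`, `compensation_copies`, and `avoids_compensation` are hypothetical, and the model ignores data dependences, the safety of speculative motion, and the suppression analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """One trace element (hypothetical model, not the article's IR)."""
    name: str
    kind: str = "op"  # "op", "split" (side exit), or "join" (side entrance)


def compensation_copies(original, scheduled):
    """Map each split/join name to the ops that need a copy on its off-trace edge.

    Simplified rule (an assumption of this sketch):
      * an op that was above a split and is now below it must be copied
        onto the split's off-trace exit edge;
      * an op that was below a join and is now above it must be copied
        onto the join's off-trace entry edge.
    Moving an op above a split is speculation and needs no copy here;
    whether that motion is safe is outside this sketch.
    """
    orig_pos = {op.name: i for i, op in enumerate(original)}
    new_pos = {op.name: i for i, op in enumerate(scheduled)}
    copies = {}
    for point in original:
        if point.kind == "op":
            continue
        needed = []
        for op in original:
            if op.kind != "op":
                continue
            was_above = orig_pos[op.name] < orig_pos[point.name]
            is_above = new_pos[op.name] < new_pos[point.name]
            if point.kind == "split" and was_above and not is_above:
                needed.append(op.name)
            elif point.kind == "join" and not was_above and is_above:
                needed.append(op.name)
        if needed:
            copies[point.name] = needed
    return copies


def avoids_compensation(original, scheduled):
    """Avoidance in this model: accept only schedules that need no copies."""
    return not compensation_copies(original, scheduled)


# A trace with one side exit: load; branch; add; store
trace = [Op("load"), Op("br", "split"), Op("add"), Op("store")]

# 'add' hoisted above the branch: no copy required in this model.
hoisted = [Op("add"), Op("load"), Op("br", "split"), Op("store")]

# 'load' sunk below the branch: a copy is required on the exit edge.
sunk = [Op("br", "split"), Op("load"), Op("add"), Op("store")]

# A trace with one side entrance, where 'y' is moved above the join.
join_trace = [Op("x"), Op("J", "join"), Op("y"), Op("z")]
join_sched = [Op("y"), Op("x"), Op("J", "join"), Op("z")]
```

In this model, a scheduler practicing avoidance would reject `sunk` and `join_sched` and accept `hoisted`. Suppression would instead keep such schedules and then try to prove, by global flow analysis, that a listed copy is redundant.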