Efficient instruction scheduling for a pipelined architecture

Authors:
Philip B. Gibbons;Steven S. Muchnick
Affiliations:
Hewlett-Packard Laboratories;Hewlett-Packard Laboratories
Venue:
SIGPLAN '86 Proceedings of the 1986 SIGPLAN symposium on Compiler construction
Year:
1986

Abstract

As part of an effort to develop an optimizing compiler for a pipelined architecture, a code reorganization algorithm has been developed that significantly reduces the number of runtime pipeline interlocks. In a pass after code generation, the algorithm uses a dag representation to heuristically schedule the instructions in each basic block. Previous algorithms for reducing pipeline interlocks have had worst-case runtimes of at least O(n^4). By using a dag representation which prevents scheduling deadlocks and a selection method that requires no lookahead, the resulting algorithm reorganizes instructions almost as effectively in practice, while having an O(n^2) worst-case runtime.
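
The general shape of such a scheduler can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the `Instr` record, the one-slot load delay, and the tie-breaking order (avoid an interlock with the previous instruction, then prefer more dag successors, then original order) are assumptions made for illustration, and memory dependences are ignored. It does show the two ingredients the abstract names: a dependence dag built by pairwise comparison in O(n^2), and a selection step that looks only at the instruction just scheduled, with no lookahead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: registers written (defs) and read (uses)."""
    text: str
    defs: frozenset
    uses: frozenset
    is_load: bool = False  # assumption: a load's result is not ready for the next instruction


def build_dag(block):
    """Compare every pair once (O(n^2)); succs[i] holds later instructions that must follow i."""
    n = len(block)
    succs = [set() for _ in range(n)]
    npreds = [0] * n
    for j in range(n):
        for i in range(j):
            a, b = block[i], block[j]
            # Read-after-write, write-after-read, or write-after-write on some register.
            if (a.defs & b.uses) or (a.uses & b.defs) or (a.defs & b.defs):
                succs[i].add(j)
                npreds[j] += 1
    return succs, npreds


def interlocks(prev, cur):
    """Assumed pipeline model: using a load's result in the very next slot stalls."""
    return prev is not None and prev.is_load and bool(prev.defs & cur.uses)


def schedule(block):
    """Greedy list scheduling over the dag; returns a new order as indices into block."""
    succs, npreds = build_dag(block)
    ready = [i for i in range(len(block)) if npreds[i] == 0]
    order, prev = [], None
    while ready:
        # No lookahead: only the previously scheduled instruction is consulted.
        best = min(ready, key=lambda i: (interlocks(prev, block[i]), -len(succs[i]), i))
        ready.remove(best)
        order.append(best)
        prev = block[best]
        for s in succs[best]:
            npreds[s] -= 1
            if npreds[s] == 0:
                ready.append(s)
    return order


def count_interlocks(block, order):
    """Number of adjacent pairs in the given order that stall under the assumed model."""
    return sum(interlocks(block[a], block[b]) for a, b in zip(order, order[1:]))


block = [
    Instr("load r1, [a]", frozenset({"r1"}), frozenset(), is_load=True),
    Instr("add  r2, r1, r3", frozenset({"r2"}), frozenset({"r1", "r3"})),
    Instr("load r4, [b]", frozenset({"r4"}), frozenset(), is_load=True),
    Instr("add  r5, r4, r6", frozenset({"r5"}), frozenset({"r4", "r6"})),
]
new_order = schedule(block)
print([block[i].text for i in new_order])
print(count_interlocks(block, range(len(block))), "->", count_interlocks(block, new_order))
```

In this example the second load is hoisted between the first load and its dependent add, so neither add immediately follows the load it reads. Each selection step scans at most n ready instructions, so the scheduling loop stays within O(n^2) overall, matching the cost of building the dag.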