Efficient Instruction Sequencing with Inline Target Insertion

Authors:
W. W. Hwu;P. P. Chang
Affiliations:
-;-
Venue:
IEEE Transactions on Computers
Year:
1992

Citing 27
Cited 3

Design Decisions in SPUR

Computer
An Instruction Issuing Approach to Enhancing Performance in Multiple Functional Unit Processors

IEEE Transactions on Computers
Highly concurrent scalar processing

ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Reducing the cost of branches

ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
HPS, a new microarchitecture: rationale and introduction

MICRO 18 Proceedings of the 18th annual workshop on Microprogramming
Branch folding in the CRISP microprocessor: reducing branch delay to zero

ISCA '87 Proceedings of the 14th annual international symposium on Computer architecture
WISQ: a restartable architecture using queues

ISCA '87 Proceedings of the 14th annual international symposium on Computer architecture
Architectural tradeoffs in the design of MIPS-X

ISCA '87 Proceedings of the 14th annual international symposium on Computer architecture
Checkpoint repair for high-performance out-of-order execution machines

IEEE Transactions on Computers
The performance potential of multiple functional unit processors

ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
Trace selection for compiling large C application programs to microcode

MICRO 21 Proceedings of the 21st annual workshop on Microprogramming and microarchitecture
Multiple instruction issue and single-chip processors

MICRO 21 Proceedings of the 21st annual workshop on Microprogramming and microarchitecture
Tradeoffs in instruction format design for horizontal architectures

ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Available instruction-level parallelism for superscalar and superpipelined machines

ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Limits on multiple instruction issue

ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Inline function expansion for compiling C programs

PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
Comparing software and hardware schemes for reducing the cost of branches

ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Forward semantic: a compiler-assisted instruction fetch method for heavily pipelined processors

MICRO 22 Proceedings of the 22nd annual workshop on Microprogramming and microarchitecture
Control flow optimization for supercomputer scalar processing

ICS '89 Proceedings of the 3rd international conference on Supercomputing
Multiple instruction issue in the NonStop cyclone processor

ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Implementation of precise interrupts in pipelined processors

ISCA '85 Proceedings of the 12th annual international symposium on Computer architecture
The Design of the 88000 RISC Family

IEEE Micro
Optimizing delayed branches

MICRO 15 Proceedings of the 15th annual workshop on Microprogramming
The 801 minicomputer

ASPLOS I Proceedings of the first international symposium on Architectural support for programming languages and operating systems
A study of branch prediction strategies

ISCA '81 Proceedings of the 8th annual symposium on Computer Architecture
A Characterization of Processor Performance in the vax-11/780

ISCA '84 Proceedings of the 11th annual international symposium on Computer architecture
Hpsm: exploiting concurrency to achieve high performance in a single-chip microarchitecture

Hpsm: exploiting concurrency to achieve high performance in a single-chip microarchitecture

Prophetic branches: a branch architecture for code compaction and efficient execution

MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
A Practical Methodology for the Formal Verification of RISC Processors

Formal Methods in System Design
Sparse code motion

Proceedings of the 27th ACM SIGPLAN-SIGACT symposium on Principles of programming languages

Quantified Score

Hi-index	14.98

Visualization

Abstract

Inline target insertion, a specific compiler and pipeline implementation method for delayed branches with squashing, is defined. The method is shown to offer two important features not discovered in previous studies. First, branches inserted into branch slots are correctly executed. Second, the execution returns correctly from interrupts or exceptions with only one program counter. These two features result in better performance and less software/hardware complexity than conventional delayed branching mechanisms.