Branch with masked squashing in superpipelined processors

Authors:
C.-L. Su; A. M. Despain
Affiliations:
Advanced Computer Architecture Laboratory, University of Southern California;Advanced Computer Architecture Laboratory, University of Southern California
Venue:
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Year:
1994

Abstract

The performance of a superpipelined processor relies heavily on its branch performance. The traditional branch strategies used in pipelined processors are delayed branches and branch with squashing. Delayed branches use safe instructions to fill delay slots. However, for a deeply pipelined processor, a compiler may not be able to find enough safe instructions to fill the branch delay slots. Branch with squashing fills the branch delay slots with instructions from the target basic block. However, the penalty of a branch misprediction is large in superpipelined processors.

In this paper, we propose a novel branch scheme, named branch with masked squashing, which combines the advantages of delayed branch and branch with squashing. The basic idea is to fill delay slots with safe instructions, which may come from before or after the branch. The remaining unfilled delay slots are filled with instructions from the predicted target path. On a misprediction, only the unsafe instructions are annulled; the safe instructions in the branch delay slots are always executed. Simulation results show that this branch strategy performs better than traditional delayed branch and branch with squashing.
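
The slot-filling and annulment rule described in the abstract can be illustrated with a small sketch. This is only a toy model under stated assumptions, not the mechanism from the paper: the names `DelaySlot`, `fill_delay_slots`, `squash_mask`, and `executed_on_branch` are hypothetical, as are the instruction strings and the choice to represent the mask as one boolean per delay slot.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DelaySlot:
    instr: str   # illustrative mnemonic only
    safe: bool   # True: correct to execute whichever way the branch goes


def fill_delay_slots(safe_instrs: List[str],
                     predicted_path_instrs: List[str],
                     num_slots: int) -> List[DelaySlot]:
    """Fill slots with safe instructions first, then predicted-path ones."""
    slots = [DelaySlot(i, True) for i in safe_instrs[:num_slots]]
    remaining = num_slots - len(slots)
    slots += [DelaySlot(i, False) for i in predicted_path_instrs[:remaining]]
    # Assumption: any slot still unfilled is padded with a (safe) no-op.
    slots += [DelaySlot("nop", True) for _ in range(num_slots - len(slots))]
    return slots


def squash_mask(slots: List[DelaySlot]) -> List[bool]:
    """One bit per slot: True means 'annul this slot on a misprediction'."""
    return [not s.safe for s in slots]


def executed_on_branch(slots: List[DelaySlot],
                       predicted_correctly: bool) -> List[str]:
    """Return the delay-slot instructions that take effect."""
    if predicted_correctly:
        return [s.instr for s in slots]
    # On a misprediction only the masked (unsafe) slots are annulled.
    return [s.instr for s, annul in zip(slots, squash_mask(slots)) if not annul]


slots = fill_delay_slots(
    safe_instrs=["add r1, r2, r3"],
    predicted_path_instrs=["ld r4, 0(r5)", "sub r6, r4, r1", "st r6, 4(r5)"],
    num_slots=3,
)
print(squash_mask(slots))
print(executed_on_branch(slots, predicted_correctly=False))
```

In this toy model a plain delayed branch corresponds to an all-false mask and plain branch with squashing to an all-true mask, which is one way to read the abstract's claim that the scheme combines the two.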