Increasing the instruction fetch rate via multiple branch prediction and a branch address cache

Authors:
Tse-Yu Yeh;Deborah T. Marr;Yale N. Patt
Affiliations:
Univ. of Michigan, Ann Arbor;Univ. of Michigan, Ann Arbor;Univ. of Michigan, Ann Arbor
Venue:
ICS '93 Proceedings of the 7th international conference on Supercomputing
Year:
1993

Citing 10
Cited 44

A VLIW architecture for a trace scheduling compiler

ASPLOS II Proceedings of the second international conference on Architectual support for programming languages and operating systems
The Cydra 5 Departmental Supercomputer: Design Philosophies, Decisions, and Trade-Offs

Computer
Single instruction stream parallelism is greater than two

ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Two-level adaptive training branch prediction

MICRO 24 Proceedings of the 24th annual international symposium on Microarchitecture
Alternative implementations of two-level adaptive branch prediction

ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Improving the accuracy of dynamic branch prediction using branch correlation

ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
A comprehensive instruction fetch mechanism for a processor supporting speculative execution

MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
A comparison of dynamic branch predictors that use two levels of branch history

ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
The superblock: an effective technique for VLIW and superscalar compilation

The Journal of Supercomputing - Special issue on instruction-level parallelism
A study of branch prediction strategies

ISCA '81 Proceedings of the 8th annual symposium on Computer Architecture

Control flow prediction with tree-like subgraphs for superscalar processors

Proceedings of the 28th annual international symposium on Microarchitecture
Multiple-block ahead branch predictors

Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Trace cache: a low latency approach to high bandwidth instruction fetching

Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Increasing the instruction fetch rate via block-structured instruction set architectures

Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Control flow prediction for dynamic ILP processors

MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
Branch history table indexing to prevent pipeline bubbles in wide-issue superscalar processors

MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
Data caches for superscalar processors

ICS '97 Proceedings of the 11th international conference on Supercomputing
Improving superscalar instruction dispatch and issue by exploiting dynamic code sequences

Proceedings of the 24th annual international symposium on Computer architecture
Exploiting instruction level parallelism in processors by caching scheduled groups

Proceedings of the 24th annual international symposium on Computer architecture
Path-based next trace prediction

MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Speculative multithreaded processors

ICS '98 Proceedings of the 12th international conference on Supercomputing
The effect of instruction fetch bandwidth on value prediction

Proceedings of the 25th annual international symposium on Computer architecture
Modeled and Measured Instruction Fetching Performance for Superscalar Microprocessors

IEEE Transactions on Parallel and Distributed Systems
A Trace Cache Microarchitecture and Evaluation

IEEE Transactions on Computers - Special issue on cache memory and related problems
Evaluation of Design Options for the Trace Cache Fetch Mechanism

IEEE Transactions on Computers - Special issue on cache memory and related problems
Decoupling local variable accesses in a wide-issue superscalar processor

ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
The block-based trace cache

ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Control Flow Prediction Schemes for Wide-Issue Superscalar Processors

IEEE Transactions on Parallel and Distributed Systems
Software trace cache

ICS '99 Proceedings of the 13th international conference on Supercomputing
A comparison of scalable superscalar processors

Proceedings of the eleventh annual ACM symposium on Parallel algorithms and architectures
Access region locality for high-bandwidth processor memory system design

Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Completion time multiple branch prediction for enhancing trace cache performance

Proceedings of the 27th annual international symposium on Computer architecture
The impact of delay on the design of branch predictors

Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
A High-Bandwidth Memory Pipeline for Wide Issue Processors

IEEE Transactions on Computers
Boosting trace cache performance with nonhead miss speculation

ICS '02 Proceedings of the 16th international conference on Supercomputing
Two cache lines prediction for a wide-issue micro-architecture

ACSAC '01 Proceedings of the 6th Australasian conference on Computer systems architecture
Increasing the Instruction Fetch Rate via Block-Structured Instruction Set Architectures

International Journal of Parallel Programming
An Exploration of Instruction Fetch Requirement in Out-of-Order Superscalar Processors

International Journal of Parallel Programming
Software Trace Cache for Commercial Applications

International Journal of Parallel Programming
On the Performance of Fetch Engines Running DSS Workloads

Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
Fetching instruction streams

Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
The Ultrascalar Processor-An Asymptotically Scalable Superscalar Microarchitecture

ARVLSI '99 Proceedings of the 20th Anniversary Conference on Advanced Research in VLSI
Reconsidering Complex Branch Predictors

HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Effective ahead pipelining of instruction block address generation

Proceedings of the 30th annual international symposium on Computer architecture
Programming skills for a changing world: back to the basics

Journal of Computing Sciences in Colleges
A low-complexity fetch architecture for high-performance superscalar processors

ACM Transactions on Architecture and Code Optimization (TACO)
Software Trace Cache

IEEE Transactions on Computers
Improving branch prediction accuracy with parallel conservative correctors

Proceedings of the 2nd conference on Computing frontiers
Branch predictor design and performance estimation for a high performance embedded microprocessor

ASP-DAC '03 Proceedings of the 2003 Asia and South Pacific Design Automation Conference
Wide and efficient trace prediction using the local trace predictor

Proceedings of the 20th annual international conference on Supercomputing
Evaluating trace cache energy efficiency

ACM Transactions on Architecture and Code Optimization (TACO)
VPC prediction: reducing the cost of indirect branches via hardware-based dynamic devirtualization

Proceedings of the 34th annual international symposium on Computer architecture
The Design and Evaluation of a Selective Way Based Trace Cache

APPT '09 Proceedings of the 8th International Symposium on Advanced Parallel Processing Technologies
Do trace cache, value prediction and prefetching improve SMT throughput?

ARCS'06 Proceedings of the 19th international conference on Architecture of Computing Systems

Quantified Score

Hi-index	0.01

Increasing the instruction fetch rate via multiple branch prediction and a branch address cache

Quantified Score

Visualization

Abstract