Wide and efficient trace prediction using the local trace predictor

Authors:
Juan C. Moure;Domingo Benítez;Dolores I. Rexachs;Emilio Luque
Affiliations:
University Autónoma of Barcelona, Barcelona, Spain;University of Las Palmas G. C., Las Palmas, Spain;University Autónoma of Barcelona, Barcelona, Spain;University Autónoma of Barcelona, Barcelona, Spain
Venue:
Proceedings of the 20th annual international conference on Supercomputing
Year:
2006

Abstract

High prediction bandwidth enables performance improvements and power reduction techniques. This paper explores a mechanism to increase prediction width (instructions per prediction) by predicting instruction traces. Our analysis shows that predicting traces that include multiple branches is not significantly less accurate than predicting single branches. A novel Local Trace Predictor organization is proposed. Relative to a Basic Block Predictor, it increases prediction width without reducing the ratio of prediction accuracy to memory resources. Compared to the previously proposed Next-Trace Predictor, the Local Trace Predictor reduces memory requirements by codifying trace predictions and by limiting the number of traces starting at the same instruction to 2 or 4. The limit lessens prediction width only slightly and does not affect prediction accuracy. The overall result is that the Local Trace Predictor outperforms the Next-Trace Predictor for predictor sizes above 12 KBytes.
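
The abstract does not give the predictor's internal organization, so the following is only a toy sketch of the two memory-saving ideas it names: capping the traces tracked per starting instruction, and storing a prediction as a small code rather than a full trace identifier. The class `ToyLocalTracePredictor`, its per-start history scheme, and all parameter names are assumptions made for illustration, not the authors' design.

```python
from collections import defaultdict


class ToyLocalTracePredictor:
    """Toy model (not the paper's design): each trace start address keeps at
    most `max_traces` traces, and a prediction is a small index into that
    list instead of a full trace identifier."""

    def __init__(self, max_traces=4, history_len=2):
        self.max_traces = max_traces
        self.history_len = history_len
        self.traces = defaultdict(list)    # start -> traces seen from start
        self.history = defaultdict(tuple)  # start -> recent trace indices
        self.pattern = {}                  # (start, history) -> trace index

    def predict(self, start):
        """Return the predicted trace for `start`, or None if none is known."""
        known = self.traces[start]
        if not known:
            return None
        idx = self.pattern.get((start, self.history[start]), 0)
        return known[idx] if idx < len(known) else known[0]

    def update(self, start, actual_trace):
        """Train with the trace that actually ran; False if it was dropped."""
        known = self.traces[start]
        if actual_trace not in known:
            if len(known) >= self.max_traces:
                return False  # per-start limit reached: trace is not tracked
            known.append(actual_trace)
        idx = known.index(actual_trace)
        hist = self.history[start]
        self.pattern[(start, hist)] = idx
        self.history[start] = (hist + (idx,))[-self.history_len:]
        return True

    def bits_per_prediction(self):
        """Bits needed to encode one prediction as an index."""
        return (self.max_traces - 1).bit_length()


# Usage: two traces alternate from the same (hypothetical) start address.
predictor = ToyLocalTracePredictor(max_traces=2, history_len=2)
for trace in ["TN", "TT", "TN", "TT"]:
    predictor.update(0x100, trace)
next_trace = predictor.predict(0x100)
```

In this sketch a limit of 2 or 4 traces per start means each stored prediction needs only 1 or 2 bits, which is the kind of saving the abstract attributes to codified predictions; the accuracy and width trade-offs reported in the paper are not modeled here.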