Latency and energy aware value prediction for high-frequency processors

Authors:
Ravi Bhargava;Lizy K. John
Affiliations:
The University of Texas at Austin, Austin, Texas;The University of Texas at Austin, Austin, Texas
Venue:
ICS '02 Proceedings of the 16th international conference on Supercomputing
Year:
2002

Citing 22
Cited 8

Improving CISC instruction decoding performance using a fill unit

Proceedings of the 28th annual international symposium on Microarchitecture
Value locality and load value prediction

Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Exceeding the dataflow limit via value prediction

Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
The predictability of data values

MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Highly accurate data value prediction using hybrid predictors

MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
The potential of data value speculation to boost ILP

ICS '98 Proceedings of the 12th international conference on Supercomputing
The effect of instruction fetch bandwidth on value prediction

Proceedings of the 25th annual international symposium on Computer architecture
Selective value prediction

ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
A scalable front-end architecture for fast instruction delivery

ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Storageless value prediction using prior register values

ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Software trace cache

ICS '99 Proceedings of the 13th international conference on Supercomputing
Clock rate versus IPC: the end of the road for conventional microarchitectures

Proceedings of the 27th annual international symposium on Computer architecture
Focusing processor policies via critical-path prediction

ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Implementation of Hybrid Context Based Value Predictors Using Value Sequence Classification

Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Efficacy and Performance Impact of Value Prediction

PACT '98 Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques
On Some Implementation Issues for Value Prediction on Wide-Issue ILP Processors

PACT '00 Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques
Hybridizing and Coalescing Load Value Predictors

ICCD '00 Proceedings of the 2000 IEEE International Conference on Computer Design: VLSI in Computers & Processors
A Power Perspective of Value Speculation for Superscalar Microprocessors

ICCD '00 Proceedings of the 2000 IEEE International Conference on Computer Design: VLSI in Computers & Processors
Dynamic Prediction of Critical Path Instructions

HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Differential FCM: Increasing Value Prediction Accuracy by Improving Table Usage Efficiency

HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Trace cache design for wide-issue superscalar processors

Trace cache design for wide-issue superscalar processors
Shade: A Fast Instruction Set Simulator for Execution Profiling

Shade: A Fast Instruction Set Simulator for Execution Profiling

Improving dynamic cluster assignment for clustered trace cache processors

Proceedings of the 30th annual international symposium on Computer architecture
On the energy-efficiency of speculative hardware

Proceedings of the 2nd conference on Computing frontiers
Speculative trivialization point advancing in high-performance processors

Journal of Systems Architecture: the EUROMICRO Journal
Partial resolution for redundant operation table

Microprocessors & Microsystems
Energy-performance design space exploration in SMT architectures exploiting selective load value predictions

Proceedings of the Conference on Design, Automation and Test in Europe
Leakage-efficient design of value predictors through state and non-state preserving techniques

The Journal of Supercomputing
Making power-efficient data value predictions

ACSAC'05 Proceedings of the 10th Asia-Pacific conference on Advances in Computer Systems Architecture
Improving energy efficiency via speculative multithreading on multicore processors

PATMOS'06 Proceedings of the 16th international conference on Integrated Circuit and System Design: power and Timing Modeling, Optimization and Simulation

Quantified Score

Hi-index	0.00

Visualization

Abstract

This work addresses the issues of access latency and energy consumption in value predictor design for high-frequency, wide-issue microprocessors. Previous value prediction research allows for generous assumptions regarding table configurations and access conditions, while ignoring prediction latencies and energy issues. However, the latency of a high-performance value predictor cannot always be completely hidden by the early stages of the instruction pipeline as previously assumed, and it causes noticeable performance degradation versus unconstrained value prediction. This paper describes and compares several variations of basic value prediction methods: at fetch, post-decode, and decoupled.The performance of at-fetch and post-decode value predictors is limited by the high access latency of accurate predictor configurations. Decoupled value prediction excels at overcoming the high-frequency table access constraints by placing completion-time predictions into a separate and easily accessible storage. However, it has high energy requirements. We study a value prediction approach that combines the latency-friendly approach of decoupled value prediction with a more energy-efficient implementation. The traditional PC-indexed prediction tables are removed and replaced by a queue of prediction traces. This latency and energy aware method of maintaining and distributing speculated values leads to a 58%-95% reduction in value predictor energy consumption versus known value prediction techniques while still maintaining high performance.