Hybrid Load-Value Predictors

Authors:
Martin Burtscher;Benjamin G. Zorn
Affiliations:
-;-
Venue:
IEEE Transactions on Computers
Year:
2002

Citing 21
Cited 9

A comparison of dynamic branch predictors that use two levels of branch history

ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Value locality and load value prediction

Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Exceeding the dataflow limit via value prediction

Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Speculative execution via address prediction and data prefetching

ICS '97 Proceedings of the 11th international conference on Supercomputing
The predictability of data values

MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Value profiling

MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Highly accurate data value prediction using hybrid predictors

MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
The effect of instruction fetch bandwidth on value prediction

Proceedings of the 25th annual international symposium on Computer architecture
Predictive techniques for aggressive load speculation

MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Correlated load-address predictors

ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Selective value prediction

ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Storageless value prediction using prior register values

ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
On Table Bandwidth and Its Update Delay for Value Prediction on Wide-Issue ILP Processors

IEEE Transactions on Computers - Special issue on the parallel architecture and compilation techniques conference
Static load classification for improving the value predictability of data-cache misses

PLDI '02 Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation
Implementation of Hybrid Context Based Value Predictors Using Value Sequence Classification

Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Efficacy and Performance Impact of Value Prediction

PACT '98 Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques
Exploring Last n Value Prediction

PACT '99 Proceedings of the 1999 International Conference on Parallel Architectures and Compilation Techniques
The Alpha 21264 Microprocessor Architecture

ICCD '98 Proceedings of the International Conference on Computer Design
Hybridizing and Coalescing Load Value Predictors

ICCD '00 Proceedings of the 2000 IEEE International Conference on Computer Design: VLSI in Computers & Processors
Differential FCM: Increasing Value Prediction Accuracy by Improving Table Usage Efficiency

HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Improving context-based load value prediction (instruction-level parallelism)

Improving context-based load value prediction (instruction-level parallelism)

VPC3: a fast and effective trace-compression algorithm

Proceedings of the joint international conference on Measurement and modeling of computer systems
Load elimination for low-power embedded processors

GLSVLSI '05 Proceedings of the 15th ACM Great Lakes symposium on VLSI
Automatic measurement of memory hierarchy parameters

SIGMETRICS '05 Proceedings of the 2005 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
The VPC Trace-Compression Algorithms

IEEE Transactions on Computers
Improving memory system performance with energy-efficient value speculation

ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
Speculative trivialization point advancing in high-performance processors

Journal of Systems Architecture: the EUROMICRO Journal
Leveraging Strength-Based Dynamic Information Flow Analysis to Enhance Data Value Prediction

ACM Transactions on Architecture and Code Optimization (TACO)
Exploiting inter-sequence correlations for program behavior prediction

Proceedings of the ACM international conference on Object oriented programming systems languages and applications
Finding the limit: examining the potential and complexity of compilation scheduling for JIT-based runtime systems

Proceedings of the 19th international conference on Architectural support for programming languages and operating systems

Quantified Score

Hi-index	14.98

Visualization

Abstract

Load instructions diminish processor performance in two ways. First, due to the continuously widening gap between CPU and memory speed, the relative latency of load instructions grows constantly and already slows program execution. Second, memory reads limit the available instruction-level parallelism because instructions that use the result of a load must wait for the memory access to complete before they can start executing. Load-value predictors alleviate both problems by allowing the CPU to speculatively continue processing without having to wait for load instructions, which can significantly improve the execution speed. While several hybrid load-value predictors have been proposed and found to work well, no systematic study of such predictors exists. In this paper, we investigate the performance of all hybrids that can be built out of a register value, a last value, a stride 2-delta, a last four value, and a finite context method predictor. Our analysis shows that hybrids can deliver 25 percent more speedup than the best single-component predictors. An examination of the individual components of hybrids revealed that predictors with a poor standalone performance sometimes make excellent components in a hybrid, while combining well-performing individual predictors often does not result in an effective hybrid. Our hybridization study identified the register value + stride 2-delta predictor as one of the best two-component hybrids. It matches or exceeds the speedup of two-component hybrids from the literature in spite of its substantially smaller and simpler design. Of all the predictors we studied, the register value + stride 2-delta + last four value hybrid performs best. It yields a harmonic-mean speedup over the eight SPECint95 programs of 17.2 percent.