A comparison of dynamic branch predictors that use two levels of branch history
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Value locality and load value prediction
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Exceeding the dataflow limit via value prediction
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
The agree predictor: a mechanism for reducing negative branch history interference
Proceedings of the 24th annual international symposium on Computer architecture
The predictability of data values
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Highly accurate data value prediction using hybrid predictors
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
The effect of instruction fetch bandwidth on value prediction
Proceedings of the 25th annual international symposium on Computer architecture
Predictive techniques for aggressive load speculation
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Correlated load-address predictors
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Storageless value prediction using prior register values
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Implementation of Hybrid Context Based Value Predictors Using Value Sequence Classification
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Efficacy and Performance Impact of Value Prediction
PACT '98 Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques
Exploring Last n Value Prediction
PACT '99 Proceedings of the 1999 International Conference on Parallel Architectures and Compilation Techniques
The Alpha 21264 Microprocessor Architecture
ICCD '98 Proceedings of the International Conference on Computer Design
Improving context-based load value prediction (instruction-level parallelism)
Improving context-based load value prediction (instruction-level parallelism)
Static load classification for improving the value predictability of data-cache misses
PLDI '02 Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation
Latency and energy aware value prediction for high-frequency processors
ICS '02 Proceedings of the 16th international conference on Supercomputing
IEEE Transactions on Computers
Low-Cost Value Predictors Using Frequent Value Locality
ISHPC '02 Proceedings of the 4th International Symposium on High Performance Computing
On the energy-efficiency of speculative hardware
Proceedings of the 2nd conference on Computing frontiers
Improving memory system performance with energy-efficient value speculation
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
Zero loads: canceling load requests by tracking zero values
Proceedings of the 9th workshop on MEmory performance: DEaling with Applications, systems and architecture
Hi-index | 0.00 |
Most well-performing load value predictors are hybrids that combine multiple predictors. Such hybrids are often large. To reduce their size and to improve their performance, this paper presents two storage reduction techniques as well as a detailed analysis of the interaction between a hybrid's components. We found that state sharing and simple value compression can shrink the size of a predictor by a factor of two without compromising the performance. Our component analysis revealed that combining well-performing predictors does not always yield a good hybrid, whereas sometimes a poor predictor can make an excellent complement to another predictor in a hybrid.Performance evaluations using a cycle-accurate simulator running SPECint95 show that hybridizing can improve non-hybrids by thirty to fifty percent over a wide range of sizes. With fifteen kilobytes of state, our coalesced-hybrid yields a harmonic mean speedup of twelve and fifteen percent with a re-fetch and a re-execute mis-prediction recovery mechanism, respectively, which is higher than the speedup of other predictors we evaluate, some of which are six times larger.