This paper examines return address prediction in high-performance processors designed with trace caches. We show that a traditional return address stack in such a processor predicts return addresses poorly when a single trace cache line contains both a function call and a return. This situation arises often in processors that demand aggressive instruction fetch bandwidth. We therefore propose two schemes to improve return address prediction accuracy, and we demonstrate that they raise return address prediction rates using minimal hardware support. We also analyze how the trace cache configuration (set associativity, cache size, and line size) affects return address prediction accuracy. Our experimental results show that, across several benchmarks, the average return address prediction accuracy can be up to 11% better than that of a traditional return address stack in a high-performance processor with a trace cache.
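For readers unfamiliar with the baseline, the following is a minimal sketch of a traditional return address stack: a call pushes its fall-through address, and a return pops the top entry as the predicted target. It is an illustration only. The class and function names, the stack depth of 8, the 4-byte instruction size, and the event format are all assumptions made for this example, and it models neither the trace cache interaction nor the two schemes proposed in the paper.

```python
class ReturnAddressStack:
    """Fixed-depth return address stack (hypothetical model, not the paper's design)."""

    def __init__(self, depth=8):
        self.depth = depth      # assumed capacity
        self.entries = []       # top of stack is the end of the list

    def on_call(self, call_pc, instr_size=4):
        # Push the address of the instruction following the call.
        if len(self.entries) == self.depth:
            self.entries.pop(0)  # overflow: the oldest entry is lost
        self.entries.append(call_pc + instr_size)

    def predict_return(self):
        # Pop the predicted return target; None if the stack is empty.
        return self.entries.pop() if self.entries else None


def prediction_accuracy(events, depth=8):
    """Fraction of returns predicted correctly.

    events: sequence of ("call", call_pc) or ("ret", actual_target).
    """
    ras = ReturnAddressStack(depth)
    correct = total = 0
    for kind, addr in events:
        if kind == "call":
            ras.on_call(addr)
        else:
            total += 1
            if ras.predict_return() == addr:
                correct += 1
    return correct / total if total else 0.0


# Two nested calls followed by their matching returns.
nested = [("call", 0x100), ("call", 0x200), ("ret", 0x204), ("ret", 0x104)]
```

With enough entries, the nested call sequence above is predicted perfectly. With `depth=1`, the outer return address is overwritten and the second return is mispredicted, which shows how any loss of pairing between calls and returns degrades accuracy.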