A Minimal Dual-Core Speculative Multi-Threading Architecture

Authors:
Srikanth T. Srinivasan;Haitham Akkary;Tom Holman;Konrad Lai
Affiliations:
Intel Corporation;Intel Corporation;Intel Corporation;Intel Corporation
Venue:
ICCD '04 Proceedings of the IEEE International Conference on Computer Design
Year:
2004

Citing 0
Cited 6

Efficient emulation of hardware prefetchers via event-driven helper threading

Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Future execution: A prefetching mechanism that uses multiple cores to speed up single threads

ACM Transactions on Architecture and Code Optimization (TACO)
Accurate branch prediction for short threads

Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
On the potential of latency tolerant execution in speculative multithreading

IFMT '08 Proceedings of the 1st international forum on Next-generation multicore/manycore technologies
Speculative-aware execution: a simple and efficient technique for utilizing multi-cores to improve single-thread performance

Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Disjoint out-of-order execution processor

ACM Transactions on Architecture and Code Optimization (TACO)

Quantified Score

Hi-index	0.00

Visualization

Abstract

Speculative Multi-Threading (SpMT) can improve single-threaded application performance using the multiple thread contexts available in current processors. We propose a minimal SpMT model that uses only two thread contexts. The model achieves significant speedups for single-threaded applications using a low-overhead scheme for detecting and selectively recovering from data dependence violations, and a novel Wrong Path Predictor to reduce the number of speculative threads executing along the wrong path. We also study the interactions between three previously proposed SpMT thread spawning policies that can be implemented dynamically in hardware - Forkon Call, Loop Continuation and Run Ahead policies - and show it is beneficial to implement all three policies together in a processor. While the individual thread spawning policies show performance benefits of 14%, 5% and 4% respectively on our SpMT model over a base processor that does not exploit SpMT, combining all three policies shows an average performance gain of 20%. Finally, we identify the sources of SpMT benefits - on average, 58% of the performance benefits due to SpMT comes from cache prefetching, 33% from instruction reuse, and 9% from branch precomputation and show all three sources of SpMT benefits must be utilized to realize the full potential of SpMT.