An evaluation of speculative instruction execution on simultaneous multithreaded processors

Authors:
Steven Swanson;Luke K. McDowell;Michael M. Swift;Susan J. Eggers;Henry M. Levy
Affiliations:
University of Washington, Seattle, WA;University of Washington, Seattle, WA;University of Washington, Seattle, WA;University of Washington, Seattle, WA;University of Washington, Seattle, WA
Venue:
ACM Transactions on Computer Systems (TOCS)
Year:
2003

Citing 19
Cited 4

Cache performance of operating system and multiprogramming workloads

ACM Transactions on Computer Systems (TOCS)
Simultaneous multithreading: maximizing on-chip parallelism

ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
An analysis of dynamic branch prediction schemes on system workloads

ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Exploiting choice: instruction fetch and issue on an implementable simultaneous multithreading processor

ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Assigning confidence to conditional branch predictions

Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Converting thread-level parallelism to instruction-level parallelism via simultaneous multithreading

ACM Transactions on Computer Systems (TOCS)
Tuning compiler optimizations for simultaneous multithreading

MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Confidence estimation for speculation control

Proceedings of the 25th annual international symposium on Computer architecture
Pipeline gating: speculation control for energy reduction

Proceedings of the 25th annual international symposium on Computer architecture
Threaded multiple path execution

Proceedings of the 25th annual international symposium on Computer architecture
Instruction fetch mechanisms for multipath execution processors

Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Performance analysis of the Alpha 21264-based Compaq ES40 system

Proceedings of the 27th annual international symposium on Computer architecture
Piranha: a scalable architecture based on single-chip multiprocessing

Proceedings of the 27th annual international symposium on Computer architecture
Symbiotic jobscheduling for a simultaneous multithreaded processor

ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
An analysis of operating system behavior on a simultaneous multithreaded architecture

ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Complete Computer System Simulation: The SimOS Approach

IEEE Parallel & Distributed Technology: Systems & Technology
Power-Sensitive Multithreaded Architecture

ICCD '00 Proceedings of the 2000 IEEE International Conference on Computer Design: VLSI in Computers & Processors
Branch Prediction and Simultaneous Multithreading

PACT '96 Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques
A comparison of the effect of branch prediction on multithreaded and scalar architectures

ACM SIGARCH Computer Architecture News

WaveScalar

Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Multithreading extension for Thumb ISA and decoder support

EHAC'06 Proceedings of the 5th WSEAS International Conference on Electronics, Hardware, Wireless and Optical Communications
The impact of speculative execution on SMT processors

International Journal of Parallel Programming
A bypass mechanism to enhance branch predictor for SMT processors

ACSAC'07 Proceedings of the 12th Asia-Pacific conference on Advances in Computer Systems Architecture

Quantified Score

Hi-index	0.00

Visualization

Abstract

Modern superscalar processors rely heavily on speculative execution for performance. For example, our measurements show that on a 6-issue superscalar, 93% of committed instructions for SPECINT95 are speculative. Without speculation, processor resources on such machines would be largely idle. In contrast to superscalars, simultaneous multithreaded (SMT) processors achieve high resource utilization by issuing instructions from multiple threads every cycle. An SMT processor thus has two means of hiding latency: speculation and multithreaded execution. However, these two techniques may conflict; on an SMT processor, wrong-path speculative instructions from one thread may compete with and displace useful instructions from another thread. For this reason, it is important to understand the trade-offs between these two latency-hiding techniques, and to ask whether multithreaded processors should speculate differently than conventional superscalars.This paper evaluates the behavior of instruction speculation on SMT processors using both multiprogrammed (SPECINT and SPECFP) and multithreaded (the Apache Web server) workloads. We measure and analyze the impact of speculation and demonstrate how speculation on an 8-context SMT differs from superscalar speculation. We also examine the effect of speculation-aware fetch and branch prediction policies in the processor. Our results quantify the extent to which (1) speculation is critical to performance on a multithreaded processor because it ensures an ample supply of parallelism to feed the functional units, and (2) SMT actually enhances the effectiveness of speculative execution, compared to a superscalar processor by reducing the impact of branch misprediction. Finally, we quantify the impact of both hardware configuration and workload characteristics on speculation's usefulness and demonstrate that, in nearly all cases, speculation is beneficial to SMT performance.