By executing two or more threads concurrently, Simultaneous MultiThreading (SMT) architectures can exploit both Instruction-Level Parallelism (ILP) and Thread-Level Parallelism (TLP) from the larger pool of in-flight instructions fetched from multiple threads. However, incorrect control speculation causes a significant number of these in-flight instructions to be discarded from the pipelines of SMT processors, a direct consequence of these pipelines getting wider and deeper. Although more accurate branch predictors may reduce the number of discarded instructions, prediction accuracy cannot easily be scaled up, since aggressive branch prediction schemes depend strongly on the predictability inherent in the application programs. In this paper, we present an efficient thread scheduling mechanism for SMT processors called SAFE-T (Speculation-Aware Front-End Throttling). It is easy to implement and allows an SMT processor to selectively perform speculative execution of threads according to the confidence level of their branch predictions, thereby preventing wrong-path instructions from being fetched. Across the multi-programmed workloads we simulate, SAFE-T reduces the number of discarded instructions by 57.9% on average and improves instructions per cycle (IPC) by 14.7% on average over the ICOUNT policy.
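The abstract does not spell out the selection logic, so the following is only a minimal sketch of the general idea of confidence-based fetch throttling layered on an ICOUNT-style baseline. The names `ThreadState`, `icount_pick`, `throttled_pick`, and the `max_low_conf` threshold are hypothetical illustrations, not the paper's actual design.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ThreadState:
    tid: int
    in_flight: int          # instructions in the front-end (the ICOUNT metric)
    low_conf_branches: int  # unresolved branches predicted with low confidence


def icount_pick(threads: List[ThreadState]) -> Optional[int]:
    """ICOUNT-style baseline: fetch from the thread with the fewest
    in-flight instructions (ties broken by thread id)."""
    if not threads:
        return None
    return min(threads, key=lambda t: (t.in_flight, t.tid)).tid


def throttled_pick(threads: List[ThreadState], max_low_conf: int = 0) -> Optional[int]:
    """Assumed throttling rule: skip any thread that has more than
    `max_low_conf` unresolved low-confidence branches, then apply the
    baseline choice to the remaining threads. Returns None when every
    thread is gated (no fetch this cycle)."""
    eligible = [t for t in threads if t.low_conf_branches <= max_low_conf]
    return icount_pick(eligible)


threads = [
    ThreadState(tid=0, in_flight=5, low_conf_branches=2),
    ThreadState(tid=1, in_flight=9, low_conf_branches=0),
    ThreadState(tid=2, in_flight=12, low_conf_branches=0),
]
baseline_choice = icount_pick(threads)
throttled_choice = throttled_pick(threads)
```

In this toy setup the baseline would favor thread 0 because it has the fewest in-flight instructions, while the throttled variant passes over it because it is likely fetching down a wrong path. A real front end would also need a policy for the case where all threads are gated.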