Simultaneous multithreading: maximizing on-chip parallelism
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
The SimpleScalar tool set, version 2.0
ACM SIGARCH Computer Architecture News
The Impact of Resource Partitioning on SMT Processors
Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques
Dynamically Controlled Resource Allocation in SMT Processors
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
MinneSPEC: A New SPEC Benchmark Workload for Simulation-Based Computer Architecture Research
IEEE Computer Architecture Letters
Learning-Based SMT Processor Resource Distribution via Hill-Climbing
Proceedings of the 33rd annual international symposium on Computer Architecture
Computer Architecture, Fourth Edition: A Quantitative Approach
Computer Architecture, Fourth Edition: A Quantitative Approach
IEEE Micro
An adaptive resource partitioning algorithm for SMT processors
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
The impact of speculative execution on SMT processors
International Journal of Parallel Programming
A study of the on-chip interconnection network for the IBM Cyclops64 multi-core architecture
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Scheduling optimization in multicore multithreaded microprocessors through dynamic modeling
Proceedings of the ACM International Conference on Computing Frontiers
Hi-index | 0.00 |
One major obstacle faced by designers when entering the multicore era is how to harness the massive computing power which these cores provide. Since Instructional-Level Parallelism (ILP) is inherently limited, one single thread is not capable of efficiently utilizing the resource of a single core. Hence, Simultaneous MultiThreading (SMT) microarchitecture can be introduced in an effort to achieve improved system resource utilization and a correspondingly higher instruction throughput through the exploitation of Thread-Level Parallelism (TLP) as well as ILP. However, when multiple threads execute concurrently in a single core, they automatically compete for system resources. Our research shows that, without control over the number of entries each thread can occupy in system resources like instruction fetch queue and/or reorder buffer, a scenario called "mutual-hindrance" execution takes place. Conversely, introducing active resource sharing control mechanisms causes the opposite situation ("mutual-benefit" execution), with a possible significant performance improvement and lower cache miss frequency. This demonstrates that active resource sharing control is essential for future multicore multithreading microprocessor design.