I-structures: data structures for parallel computing
ACM Transactions on Programming Languages and Systems (TOPLAS)
Simultaneous multithreading: maximizing on-chip parallelism
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Speculative multithreaded processors
ICS '98 Proceedings of the 12th international conference on Supercomputing
A scalable approach to thread-level speculation
Proceedings of the 27th annual international symposium on Computer architecture
Clock rate versus IPC: the end of the road for conventional microarchitectures
Proceedings of the 27th annual international symposium on Computer architecture
Dynamic frequency and voltage control for a multiple clock domain microarchitecture
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Hardware for Speculative Parallelization of Partially-Parallel Loops in DSM Multiprocessors
HPCA '99 Proceedings of the 5th International Symposium on High Performance Computer Architecture
Superscalar Execution with Direct Data Forwarding
PACT '98 Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques
Soft Real- Time Scheduling on Simultaneous Multithreaded Processors
RTSS '02 Proceedings of the 23rd IEEE Real-Time Systems Symposium
Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture
Proceedings of the 30th annual international symposium on Computer architecture
Computer Architecture: A Quantitative Approach
Computer Architecture: A Quantitative Approach
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Hi-index | 0.00 |
In this paper we provide both a qualitative and a quantitative evaluation of a decoupled multithreaded architecture that uses non-blocking threads. Our architecture is based on simple in-order pipelines and complete decoupling of memory accesses from execution pipelines. We extend the architecture to support thread level speculation using snooping cache coherency protocols. We evaluate the performance gains from speculations by varying the number of load/store instructions compared to computational instructions, miss speculation rates and the degree of thread level speculation. Our architecture presents a viable alternative to complex superscalar and super-speculative CPUs.