Supporting speculative multithreading on simultaneous multithreaded processors

Authors:
Venkatesan Packirisamy; Shengyue Wang; Antonia Zhai; Wei-Chung Hsu; Pen-Chung Yew
Affiliations:
Department of Computer Science, University of Minnesota, Minneapolis (all authors)
Venue:
HiPC'06 Proceedings of the 13th international conference on High Performance Computing
Year:
2006

References (14)

ARB: A Hardware Mechanism for Dynamic Reordering of Memory References. IEEE Transactions on Computers.
Exploiting choice: instruction fetch and issue on an implementable simultaneous multithreading processor. ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture.
A dynamic multithreading processor. MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture.
A Chip-Multiprocessor Architecture with Speculative Multithreading. IEEE Transactions on Computers.
A scalable approach to thread-level speculation. Proceedings of the 27th annual international symposium on Computer architecture.
Speculative Versioning Cache. IEEE Transactions on Parallel and Distributed Systems.
Compiler optimization of scalar value communication between speculative threads. Proceedings of the 10th international conference on Architectural support for programming languages and operating systems.
Exploiting Speculative Thread-Level Parallelism on a SMT Processor. HPCN Europe '99 Proceedings of the 7th International Conference on High-Performance Computing and Networking.
Cherry: checkpointed early resource recycling in out-of-order microprocessors. Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture.
Implicitly-multithreaded processors. Proceedings of the 30th annual international symposium on Computer architecture.
Pin: building customized program analysis tools with dynamic instrumentation. Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation.
The STAMPede approach to thread-level speculation. ACM Transactions on Computer Systems (TOCS).
Tolerating Dependences Between Large Speculative Threads Via Sub-Threads. Proceedings of the 33rd annual international symposium on Computer Architecture.
Loop selection for thread-level speculation. LCPC'05 Proceedings of the 18th international conference on Languages and Compilers for Parallel Computing.

Cited by (5)

An Evaluation of Misaligned Data Access Handling Mechanisms in Dynamic Binary Translation Systems. Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization.
Energy efficient speculative threads: dynamic thread allocation in Same-ISA heterogeneous multicore systems. Proceedings of the 19th international conference on Parallel architectures and compilation techniques.
Efficient and effective misaligned data access handling in a dynamic binary translation system. ACM Transactions on Architecture and Code Optimization (TACO).
SCIN-cache: Fast speculative versioning in multithreaded cores. ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers.
The design and implementation of heterogeneous multicore systems for energy-efficient speculative thread execution. ACM Transactions on Architecture and Code Optimization (TACO).

Abstract

Speculative multithreading is a technique for improving single-thread performance. Speculative multithreading architectures for chip multiprocessors (CMPs) have been studied extensively, but there have been relatively few studies on the design of speculative multithreading for simultaneous multithreading (SMT) processors. The current SMT-based designs, IMT [9] and DMT [2], use the load/store queue (LSQ) to perform dependence checking. Since the size of the LSQ is limited, this design is suitable only for small threads. In this paper we present a novel cache-based architecture that supports speculative simultaneous multithreading and can efficiently handle larger threads. In our architecture, the associativity of the cache is used to buffer speculative values. Our 4-thread architecture can achieve about 15% speedup over an equivalent superscalar processor and about 3% speedup on average over the LSQ-based architectures, while using less complex hardware. For larger threads, our scheme can also perform 14% better than the LSQ-based scheme.