Thread-level speculation provides architectural support to aggressively run hard-to-analyze code in parallel. As speculative tasks run concurrently, they generate unsafe or speculative memory state that needs to be separately buffered and managed in the presence of distributed caches and buffers. Such state may contain multiple versions of the same variable.

In this paper, we introduce a novel taxonomy of approaches to buffer and manage multi-version speculative memory state in multiprocessors. We also present a detailed complexity-benefit trade-off analysis of the different approaches. Finally, we use numerical applications to evaluate the performance of the approaches under a single architectural framework. Our key insights are that support for buffering the state of multiple speculative tasks and versions per processor is more complexity-effective than support for merging the state of tasks with main memory lazily. Moreover, both supports can be gainfully combined and, in large machines, their effect is nearly fully additive. Finally, the more complex support for future state in main memory can boost performance when buffers are under pressure, but hurts performance when squashes are frequent.
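To make the notion of multi-version speculative state concrete, the following is a minimal software sketch, not the hardware schemes the paper classifies. It assumes that tasks are ordered by an integer id, that each task buffers its own versions of the variables it writes, that a read returns the version from the closest predecessor, and that only the oldest task may merge its state with main memory. The class `VersionedMemory` and all of its method names are hypothetical and chosen purely for illustration.

```python
class VersionedMemory:
    """Toy model of per-task version buffering (illustrative assumption only)."""

    MAIN = -1  # pseudo task id meaning "value came from committed main memory"

    def __init__(self, initial):
        self.main = dict(initial)  # safe, committed state
        self.writes = {}           # task_id -> {var: buffered speculative value}
        self.reads = {}            # task_id -> {var: id of the task it read from}

    def begin(self, task_id):
        # Start (or restart after a squash) a task with empty buffers.
        self.writes[task_id] = {}
        self.reads[task_id] = {}

    def read(self, task_id, var):
        # A task always sees its own version first.
        if var in self.writes[task_id]:
            return self.writes[task_id][var]
        # Otherwise use the version of the closest predecessor task, if any.
        for tid in sorted(self.writes, reverse=True):
            if tid < task_id and var in self.writes[tid]:
                self._note_read(task_id, var, tid)
                return self.writes[tid][var]
        self._note_read(task_id, var, self.MAIN)
        return self.main[var]

    def _note_read(self, task_id, var, source):
        # Remember the oldest source this task has consumed for var.
        prev = self.reads[task_id].get(var)
        self.reads[task_id][var] = source if prev is None else min(prev, source)

    def write(self, task_id, var, value):
        # Create or update this task's version; other tasks' versions coexist.
        self.writes[task_id][var] = value
        # A later task that already read var from this task or an older source
        # consumed a stale value, so it violates sequential order.
        violators = [t for t, r in self.reads.items()
                     if t > task_id and var in r and r[var] <= task_id]
        if not violators:
            return []
        # Conservative policy: squash the first violator and every later task.
        first = min(violators)
        squashed = sorted(t for t in self.writes if t >= first)
        for t in squashed:
            self.begin(t)
        return squashed

    def commit(self, task_id):
        # Only the oldest active task is non-speculative and may merge.
        if task_id != min(self.writes):
            raise ValueError("only the oldest task may commit")
        self.main.update(self.writes.pop(task_id))
        del self.reads[task_id]


mem = VersionedMemory({"x": 0})
mem.begin(1)
mem.begin(2)
mem.write(2, "x", 20)   # task 2 creates its own version of x
mem.write(1, "x", 10)   # task 1 creates another; both versions now coexist
mem.commit(1)           # versions merge with main memory in task order
mem.commit(2)
```

In this sketch a task's state stays in its buffer until the task becomes the oldest one, which is one simple merging policy; the abstract contrasts several such policies, and this code does not attempt to reproduce that taxonomy or its performance trade-offs.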