Chip Multiprocessors (CMPs) are flexible, high-frequency platforms on which to support Thread-Level Speculation (TLS). However, for TLS to deliver on its promise, CMPs must exploit multiple sources of speculative task-level parallelism, including any nesting level of both subroutines and loop iterations. Unfortunately, these environments are hard to support in decentralized CMP hardware: since tasks are spawned out-of-order and unpredictably, maintaining key TLS basics such as task ordering and efficient resource allocation is challenging.

While the concept of out-of-order spawning is not new, this paper is the first to propose a set of microarchitectural mechanisms that, together, fundamentally enable fast TLS with out-of-order spawn in a CMP. Moreover, we develop a fully automated TLS compiler for aggressive out-of-order spawn. With our mechanisms, a TLS CMP with four 4-issue cores achieves an average speedup of 1.30 for full SPECint 2000 applications; the corresponding speedup for in-order-only spawn is 1.04. Overall, our mechanisms unlock the potential of TLS for the toughest applications.
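To illustrate why task ordering gets harder with out-of-order spawn, the following is a minimal sketch and not the mechanism proposed in the paper. It assumes that a spawned task is placed immediately after its parent in sequential order, and that a spawn counts as in-order only when the parent is the most speculative task at that moment. The class `TaskOrder` and the task names (`main`, `C1`, `L1`, `L2`) are hypothetical.

```python
class TaskOrder:
    """Sequential order of speculative tasks, least speculative first.

    Illustrative assumption: a child is inserted right after its parent.
    """

    def __init__(self, root):
        self.order = [root]

    def spawn(self, parent, child):
        # In-order spawn: the parent is currently the most speculative task,
        # so the child simply becomes the new tail of the order.
        in_order = self.order[-1] == parent
        # Out-of-order spawn: the child lands in the middle of the order,
        # ahead of tasks that were spawned earlier.
        self.order.insert(self.order.index(parent) + 1, child)
        return in_order


tasks = TaskOrder("main")
flags = [
    tasks.spawn("main", "C1"),  # continuation after a subroutine call
    tasks.spawn("main", "L1"),  # loop iteration inside the subroutine
    tasks.spawn("L1", "L2"),    # next loop iteration
]
print(tasks.order)
print(flags)
```

In this sketch only the first spawn is in-order. The two loop-iteration tasks are created after `C1` yet precede it in sequential order, which is the kind of nested subroutine and loop parallelism the abstract says an in-order-only scheme cannot exploit.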