Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Compiler optimization of scalar value communication between speculative threads
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
TEST: a tracer for extracting speculative threads
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Using thread-level speculation to simplify manual parallelization
Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
Toward efficient and robust software speculative parallelization on multiprocessors
Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
The Jrpm system for dynamically parallelizing Java programs
Proceedings of the 30th annual international symposium on Computer architecture
Thread Partitioning and Value Prediction for Exploiting Speculative Thread-Level Parallelism
IEEE Transactions on Computers
Compiler Optimization of Memory-Resident Value Communication Between Speculative Threads
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Compiler Estimation of Load Imbalance Overhead in Speculative Parallelization
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
Reducing misspeculation overhead for module-level speculative execution
Proceedings of the 2nd conference on Computing frontiers
Mitosis compiler: an infrastructure for speculative threading based on pre-computation slices
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Exposing speculative thread parallelism in SPEC2000
Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming
The STAMPede approach to thread-level speculation
ACM Transactions on Computer Systems (TOCS)
Tolerating Dependences Between Large Speculative Threads Via Sub-Threads
Proceedings of the 33rd annual international symposium on Computer Architecture
A compiler cost model for speculative parallelization
ACM Transactions on Architecture and Code Optimization (TACO)
Journal of Parallel and Distributed Computing
Spice: speculative parallel iteration chunk execution
Proceedings of the 6th annual IEEE/ACM international symposium on Code generation and optimization
Compiler and hardware support for reducing the synchronization of speculative threads
ACM Transactions on Architecture and Code Optimization (TACO)
Languages and Compilers for Parallel Computing
Extending concurrency of transactional memory programs by using value prediction
Proceedings of the 6th ACM conference on Computing frontiers
Dynamic performance tuning for speculative threads
Proceedings of the 36th annual international symposium on Computer architecture
Exploiting speculative thread-level parallelism in data compression applications
LCPC'06 Proceedings of the 19th international conference on Languages and compilers for parallel computing
Safe programmable speculative parallelism
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
RETCON: transactional repair without replay
Proceedings of the 37th annual international symposium on Computer architecture
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Transactional conflict decoupling and value prediction
Proceedings of the international conference on Supercomputing
Proceedings of the 38th annual international symposium on Computer architecture
Exploiting thread-level speculative parallelism with software value prediction
ACSAC'05 Proceedings of the 10th Asia-Pacific conference on Advances in Computer Systems Architecture
HiRe: using hint & release to improve synchronization of speculative threads
Proceedings of the 26th ACM international conference on Supercomputing
BlockChop: dynamic squash elimination for hybrid processor architecture
Proceedings of the 39th Annual International Symposium on Computer Architecture
Dynamically dispatching speculative threads to improve sequential execution
ACM Transactions on Architecture and Code Optimization (TACO)
Disjoint out-of-order execution processor
ACM Transactions on Architecture and Code Optimization (TACO)
Hi-index | 0.01 |
With speculative thread-level parallelization, codes that cannot be fully compiler-analyzed are aggressively executed in parallel. If the hardware detects a cross-thread dependence violation, it squashes offending threads and resumes execution. Unfortunately, frequent squashing cripples performance.This paper proposes a new framework of hardware mechanisms to eliminate most squashes due to data dependences in multiprocessors. The framework works by learning and predicting violations, and applying delayed disambiguation, value prediction, and stall and release. The framework is suited for directory-based multiprocessors that track memory accesses at the system level with the coarse granularity of memory lines. Simulations of a 16-processor machine show that the framework is very effective. By adding our framework to a speculative CC-NUMA with 64-byte memory lines, we speed-up applications by an average of 4.3 times. Moreover, the resulting system is even 23% faster than a machine that tracks memory accesses at the fine granularity of words - a sophisticated system that is not compatible with mainstream cache coherence protocols.