Improving Value Communication for Thread-Level Speculation

Authors:
J. Gregory Steffan;Christopher B. Colohan;Antonia Zhai;Todd C. Mowry
Affiliations:
-;-;-;-
Venue:
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
Year:
2002

Citing 0
Cited 36

Temporally silent stores

Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Compiler optimization of scalar value communication between speculative threads

Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
TEST: a tracer for extracting speculative threads

Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Using thread-level speculation to simplify manual parallelization

Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
Toward efficient and robust software speculative parallelization on multiprocessors

Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
The Jrpm system for dynamically parallelizing Java programs

Proceedings of the 30th annual international symposium on Computer architecture
Thread Partitioning and Value Prediction for Exploiting Speculative Thread-Level Parallelism

IEEE Transactions on Computers
Compiler Optimization of Memory-Resident Value Communication Between Speculative Threads

Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Decoupled Software Pipelining with the Synchronization Array

Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
Compiler Estimation of Load Imbalance Overhead in Speculative Parallelization

Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
Reducing misspeculation overhead for module-level speculative execution

Proceedings of the 2nd conference on Computing frontiers
Mitosis compiler: an infrastructure for speculative threading based on pre-computation slices

Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Exposing speculative thread parallelism in SPEC2000

Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming
The STAMPede approach to thread-level speculation

ACM Transactions on Computer Systems (TOCS)
Thread-Level Speculation on a CMP can be energy efficient

Proceedings of the 19th annual international conference on Supercomputing
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing

Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Tolerating Dependences Between Large Speculative Threads Via Sub-Threads

Proceedings of the 33rd annual international symposium on Computer Architecture
Implicit parallelism with ordered transactions

Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Hardware support for software controlled multithreading

ACM SIGARCH Computer Architecture News
A compiler cost model for speculative parallelization

ACM Transactions on Architecture and Code Optimization (TACO)
Incrementally parallelizing database transactions with thread-level speculation

ACM Transactions on Computer Systems (TOCS)
Spice: speculative parallel iteration chunk execution

Proceedings of the 6th annual IEEE/ACM international symposium on Code generation and optimization
Compiler and hardware support for reducing the synchronization of speculative threads

ACM Transactions on Architecture and Code Optimization (TACO)
Extending concurrency of transactional memory programs by using value prediction

Proceedings of the 6th ACM conference on Computing frontiers
Speculative parallelization of multipath radiosity algorithm

SPECTS'09 Proceedings of the 12th international conference on Symposium on Performance Evaluation of Computer & Telecommunication Systems
Speculation for Parallelizing Runtime Checks

SSS '09 Proceedings of the 11th International Symposium on Stabilization, Safety, and Security of Distributed Systems
Safe programmable speculative parallelism

PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
RETCON: transactional repair without replay

Proceedings of the 37th annual international symposium on Computer architecture
Transactional conflict decoupling and value prediction

Proceedings of the international conference on Supercomputing
Loop selection for thread-level speculation

LCPC'05 Proceedings of the 18th international conference on Languages and Compilers for Parallel Computing
Leveraging Strength-Based Dynamic Information Flow Analysis to Enhance Data Value Prediction

ACM Transactions on Architecture and Code Optimization (TACO)
Exploiting thread-level speculative parallelism with software value prediction

ACSAC'05 Proceedings of the 10th Asia-Pacific conference on Advances in Computer Systems Architecture
Distributed replay protocol for distributed uniprocessors

Proceedings of the 26th ACM international conference on Supercomputing
Disjoint out-of-order execution processor

ACM Transactions on Architecture and Code Optimization (TACO)
Practical speculative parallelization of variable-length decompression algorithms

Proceedings of the 14th ACM SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems
HPar: A practical parallel parser for HTML--taming HTML complexities for parallel parsing

ACM Transactions on Architecture and Code Optimization (TACO)

Quantified Score

Hi-index	0.00

Visualization

Abstract

Thread-Level Speculation (TLS) allows us to automatically parallelize general-purpose programs by supporting parallel execution of threads that might not actually be independent. In this paper, we show that the key to good performance lies in the three different ways to communicate a value between speculative threads: speculation, synchronization, and prediction. The difficult part is deciding how and when to apply each method.This paper shows how we can apply value prediction, dynamic synchronization, and hardware instruction prioritization to improve value communication and hence performance in several SPECint benchmarks that have been automatically-transformed by our compiler to exploit TLS. We find that value prediction can be effective when properly throttled to avoid the high costs of misprediction, while most of the gains of value prediction can be more easily achieved by exploiting silent stores. We also show that dynamic synchronization is quite effective for most benchmarks, while hardware instruction prioritization is not. Overall, we find that these techniques have great potential for improving the performance of TLS.