ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
Compile-time minimisation of load imbalance in loop nests
ICS '97 Proceedings of the 11th international conference on Supercomputing
Dynamic speculation and synchronization of data dependences
Proceedings of the 24th annual international symposium on Computer architecture
Speculative multithreaded processors
ICS '98 Proceedings of the 12th international conference on Supercomputing
Task selection for a multiscalar processor
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
A dynamic multithreading processor
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Data speculation support for a chip multiprocessor
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Improving the performance of speculatively parallel applications on the Hydra CMP
ICS '99 Proceedings of the 13th international conference on Supercomputing
A Chip-Multiprocessor Architecture with Speculative Multithreading
IEEE Transactions on Computers
The Superthreaded Processor Architecture
IEEE Transactions on Computers
Multiplex: unifying conventional and speculative thread-level parallelism on a chip multiprocessor
ICS '01 Proceedings of the 15th international conference on Supercomputing
Exact analysis of the cache behavior of nested loops
Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
A general compiler framework for speculative multithreading
Proceedings of the fourteenth annual ACM symposium on Parallel algorithms and architectures
Compiler optimization of scalar value communication between speculative threads
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
IEEE Micro
Application Load Imbalance on Parallel Processors
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
Compiler Techniques for Concurrent Multithreading with Hardware Speculation Support
LCPC '96 Proceedings of the 9th International Workshop on Languages and Compilers for Parallel Computing
TEST: a tracer for extracting speculative threads
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Using thread-level speculation to simplify manual parallelization
Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
Compiler support for speculative multithreading architecture with probabilistic points-to analysis
Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
The Potential for Using Thread-Level Data Speculation to Facilitate Automatic Parallelization
HPCA '98 Proceedings of the 4th International Symposium on High-Performance Computer Architecture
Tradeoffs in Buffering Memory State for Thread-Level Speculation in Multiprocessors
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
In Search of Speculative Thread-Level Parallelism
PACT '99 Proceedings of the 1999 International Conference on Parallel Architectures and Compilation Techniques
Improving Speculative Thread-Level Parallelism Through Module Run-Length Prediction
IPDPS '03 Proceedings of the 17th International Symposium on Parallel and Distributed Processing
Let's Study Whole-Program Cache Behaviour Analytically
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
Thread-Spawning Schemes for Speculative Multithreading
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
Improving Value Communication for Thread-Level Speculation
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
Compiling for the multiscalar architecture
Compiling for the multiscalar architecture
Compiler Optimization of Memory-Resident Value Communication Between Speculative Threads
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
A cost-driven compilation framework for speculative parallelization of sequential programs
Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation
MinneSPEC: A New SPEC Benchmark Workload for Simulation-Based Computer Architecture Research
IEEE Computer Architecture Letters
The structure of a compiler for explicit and implicit parallelism
LCPC'01 Proceedings of the 14th international conference on Languages and compilers for parallel computing
A compiler cost model for speculative parallelization
ACM Transactions on Architecture and Code Optimization (TACO)
Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
Automatically tuning parallel and parallelized programs
LCPC'09 Proceedings of the 22nd international conference on Languages and Compilers for Parallel Computing
Vectorization past dependent branches through speculation
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
Integrating profile-driven parallelism detection and machine-learning-based mapping
ACM Transactions on Architecture and Code Optimization (TACO)
Hi-index | 0.00 |
Speculative parallelization is a technique that complements automatic compiler parallelization by allowing code sections that cannot be fully analyzed by the compiler to be aggresively executed in parallel.However, while speculative parallelization can potentially deliver significant speedups, several overheads associated with the technique limit these speedups in practice. This paper proposes a novel compiler model of speculative multithreaded execution that can be used to reason about the overheads and expected resulting performance gains, or losses, from speculative parallelization.This model is based on estimating the likely execution duration of threads, properly takes into account the scheduling restrictions of most speculative execution environments, and can include all speculative parallelization overheads.Also, different from heuristics that attempt to qualitatively estimate potentially "good" or "bad" sections for speculative multithreaded execution, this model allows the compiler to estimate the speedup or slowdoen quantitatively.Such quantitative estimate can then be used by the compiler or run-time system to make more complex and educated tradeoff decisions. We use the proposed framework on a number of loops from a collection of SPEC benchmarks that suffer mainly from load imbalance and thread dispatch and commit overheads. Experimental results show that our framework can identify on average 68% of the loops that cause slowdowns and on average 97% of the loops that lead to speedups.In fact, our framework predicts the speedups or slowdowns with an error of less than 20% for an average of 44% of the loops across the benchamrks, and with an error of less than 50% for an average of 84% of the loops.Overall, our framework leads to a performance improvement of 5% on average, and as high as 38%, over a naive approach that attempts to speculatively parallelize all the loops considered.