Array SSA form and its use in parallelization
POPL '98 Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Speculative execution model with duplication
ICS '98 Proceedings of the 12th international conference on Supercomputing
Data speculation support for a chip multiprocessor
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
SUIF Explorer: an interactive and interprocedural parallelizer
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Improving the performance of speculatively parallel applications on the Hydra CMP
ICS '99 Proceedings of the 13th international conference on Supercomputing
A Chip-Multiprocessor Architecture with Speculative Multithreading
IEEE Transactions on Computers
Control independence in trace processors
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Slipstream processors: improving both performance and fault tolerance
ACM SIGPLAN Notices
Architecture of the Atlas Chip-Multiprocessor: Dynamically Parallelizing Irregular Applications
IEEE Transactions on Computers
Time Stamp Algorithms for Runtime Parallelization of DOACROSS Loops with Dynamic Dependences
IEEE Transactions on Parallel and Distributed Systems
Slipstream processors: improving both performance and fault tolerance
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
IEEE Transactions on Parallel and Distributed Systems
Techniques for speculative run-time parallelization of loops
SC '98 Proceedings of the 1998 ACM/IEEE conference on Supercomputing
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Containers on the Parallelization of General-Purpose Java Programs
International Journal of Parallel Programming
Computer
Dynamic Parallel media processing using Speculative Broadcast Loop (SBL)
IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
Automatic Parallelization of C by Means of Language Transcription
LCPC '98 Proceedings of the 11th International Workshop on Languages and Compilers for Parallel Computing
Limits of Task-Based Parallelism in Irregular Applications
ISHPC '00 Proceedings of the Third International Symposium on High Performance Computing
Compiler support for speculative multithreading architecture with probabilistic points-to analysis
Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
Exploring Microprocessor Architectures for Gigascale Integration
ARVLSI '99 Proceedings of the 20th Anniversary Conference on Advanced Research in VLSI
Multiple-path execution for chip multiprocessors
Journal of Systems Architecture: the EUROMICRO Journal
Mitosis compiler: an infrastructure for speculative threading based on pre-computation slices
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
A theory of nested speculative execution
COORDINATION'07 Proceedings of the 9th international conference on Coordination models and languages
A pattern language for parallelizing irregular algorithms
Proceedings of the 2010 Workshop on Parallel Programming Patterns
IBM Blue Gene/Q memory subsystem with speculative execution and transactional memory
IBM Journal of Research and Development
Hi-index | 0.01 |
Thread-level speculation (TLS) makes it possible to parallelize general purpose C programs. This paper proposes software and hardware mechanisms that support speculative thread- level execution on a single-chip multiprocessor. A detailed analysis of programs using the TLS execution model shows a bound on the performance of a TLS machine that is promising. In particular, TLS makes it feasible to find speculative do across parallelism in outer loops that can greatly improve the performance of general-purpose applications. Exploiting speculative thread-level parallelism on a multiprocessor requires the compiler to determine where to speculate, and to generate SPMD (single program multiple data) code.We have developed a fully automatic compiler system that uses profile information to determine the best loops to execute speculatively, and to generate the synchronization code that improves the performance of speculative execution. The hardware mechanisms required to support speculation are simple extensions to the cache hierarchy of a single chip multiprocessor. We show that with our proposed mechanisms, thread-level speculation provides significant performance benefits.