ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
ICS '98 Proceedings of the 12th international conference on Supercomputing
A dynamic multithreading processor
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Data speculation support for a chip multiprocessor
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
SUIF Explorer: an interactive and interprocedural parallelizer
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Compiler optimization of scalar value communication between speculative threads
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
The Potential for Using Thread-Level Data Speculation to Facilitate Automatic Parallelization
HPCA '98 Proceedings of the 4th International Symposium on High-Performance Computer Architecture
In Search of Speculative Thread-Level Parallelism
PACT '99 Proceedings of the 1999 International Conference on Parallel Architectures and Compilation Techniques
Hi-index | 0.00 |
General-purpose computing is taking an irreversible step toward on-chip parallel architectures. One way to enhance the performance of chip multiprocessors is the use of thread-level speculation (TLS). Identifying the points where the speculative threads will be spawned becomes one of the critical issues of this kind of architectures. In this paper, a criterion for selecting the region to be speculatively executed is presented to identify potential sources of speculative parallelism in general-purpose programs. A dynamic profiling method has been provided to search a large space of TLS parallelization schemes and where parallelism was located within the application. We analyze key factors impacting speculative thread-level parallelism of SPEC CPU2000, evaluate whether a given application or parts of it are suitable for TLS technology, and study how to balance thread partition for efficiently exploiting speculative thread-level parallelism. It shows that the inter-thread data dependences are ubiquitous and the synchronization mechanism is necessary; Return value prediction and loop unrolling are important to improve performance. The information we got can be used to guide the thread partition of TLS.