MASA: a multithreaded processor architecture for parallel symbolic computing
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
The expandable split window paradigm for exploiting fine-grain parallelsim
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
ICS '98 Proceedings of the 12th international conference on Supercomputing
Task selection for a multiscalar processor
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Data speculation support for a chip multiprocessor
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
A scalable approach to thread-level speculation
Proceedings of the 27th annual international symposium on Computer architecture
Compiler optimization of scalar value communication between speculative threads
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Speculative Alias Analysis for Executable Code
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
A compiler framework for speculative analysis and optimizations
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Compiler support for speculative multithreading architecture with probabilistic points-to analysis
Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
Control Speculation in Multithreaded Processors through Dynamic Loop Detection
HPCA '98 Proceedings of the 4th International Symposium on High-Performance Computer Architecture
The Jrpm system for dynamically parallelizing Java programs
Proceedings of the 30th annual international symposium on Computer architecture
A Programmable Co-processor for Profiling
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Min-cut program decomposition for thread-level speculation
Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation
A cost-driven compilation framework for speculative parallelization of sequential programs
Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation
Mitosis compiler: an infrastructure for speculative threading based on pre-computation slices
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Tasking with out-of-order spawn in TLS chip multiprocessors: microarchitecture and compilation
Proceedings of the 19th annual international conference on Supercomputing
POSH: a TLS compiler that exploits program structure
Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Implicit parallelism with ordered transactions
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Speculative thread decomposition through empirical optimization
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Tight analysis of the performance potential of thread speculation using spec CPU 2006
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Modeling optimistic concurrency using quantitative dependence analysis
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming
Estimating and exploiting potential parallelism by source-level dependence profiling
EuroPar'10 Proceedings of the 16th international Euro-Par conference on Parallel processing: Part I
SD3: A Scalable Approach to Dynamic Data-Dependence Profiling
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
How many threads to spawn during program multithreading?
LCPC'10 Proceedings of the 23rd international conference on Languages and compilers for parallel computing
Kremlin: rethinking and rebooting gprof for the multicore age
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Programmer-assisted automatic parallelization
Proceedings of the 2011 Conference of the Center for Advanced Studies on Collaborative Research
Dynamic trace-based analysis of vectorization potential of applications
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
Efficient and accurate data dependence profiling using software signatures
Proceedings of the Tenth International Symposium on Code Generation and Optimization
Fast loop-level data dependence profiling
Proceedings of the 26th ACM international conference on Supercomputing
Multi-slicing: a compiler-supported parallel approach to data dependence profiling
Proceedings of the 2012 International Symposium on Software Testing and Analysis
HydraVM: extracting parallelism from legacy sequential code using STM
HotPar'12 Proceedings of the 4th USENIX conference on Hot Topics in Parallelism
CUBIT: compact bitmap profiling for dynamic data dependence analysis
Proceedings of the 2013 Research in Adaptive and Convergent Systems
Beyond reuse distance analysis: Dynamic analysis for characterization of data locality potential
ACM Transactions on Architecture and Code Optimization (TACO)
Integrating profile-driven parallelism detection and machine-learning-based mapping
ACM Transactions on Architecture and Code Optimization (TACO)
Hi-index | 0.00 |
As hardware systems move toward multicore and multithreaded architectures, programmers increasingly rely on automated tools to help with both the parallelization of legacy codes and effective exploitation of all available hardware resources. Thread-level speculation (TLS) has been proposed as a technique to parallelize the execution of serial codes or serial sections of parallel codes. One of the key aspects of TLS is task selection for speculative execution. In this paper we propose a cost model for compiler-driven task selection for TLS. The model employs profile-based analysis of may -dependences to estimate the probability of successful speculation. We discuss two techniques to eliminate potential inter-task dependences, thereby improving the rate of successful speculation. We also present a profiling tool, DProf, that is used to provide run-time information about may -dependences to the compiler and map dynamic dependences to the source code. This information is also made available to the programmer to assist in code rewriting and/or algorithm redesign. We used DProf to quantify the potential of this approach and we present results on selected applications from the SPEC CPU2006 and SEQUOIA benchmarks.