Efficiently computing static single assignment form and the control dependence graph
ACM Transactions on Programming Languages and Systems (TOPLAS)
Dynamic speculation and synchronization of data dependences
Proceedings of the 24th annual international symposium on Computer architecture
MediaBench: a tool for evaluating and synthesizing multimedia and communicatons systems
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Task selection for a multiscalar processor
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
A dynamic multithreading processor
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Clustered speculative multithreaded processors
ICS '99 Proceedings of the 13th international conference on Supercomputing
Implementation of precise interrupts in pipelined processors
ISCA '85 Proceedings of the 12th annual international symposium on Computer architecture
A scalable approach to thread-level speculation
Proceedings of the 27th annual international symposium on Computer architecture
Architectural support for scalable speculative parallelization in shared-memory multiprocessors
Proceedings of the 27th annual international symposium on Computer architecture
A general compiler framework for speculative multithreading
Proceedings of the fourteenth annual ACM symposium on Parallel algorithms and architectures
Compiler optimization of scalar value communication between speculative threads
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
FX!32: A Profile-Directed Binary Translator
IEEE Micro
IEEE Micro
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
An infrastructure for adaptive dynamic optimization
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
HPCA '98 Proceedings of the 4th International Symposium on High-Performance Computer Architecture
Tradeoffs in Buffering Memory State for Thread-Level Speculation in Multiprocessors
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
The Jrpm system for dynamically parallelizing Java programs
Proceedings of the 30th annual international symposium on Computer architecture
Value-driven redundancy elimination
Value-driven redundancy elimination
A cost-driven compilation framework for speculative parallelization of sequential programs
Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation
Transactional Memory Coherence and Consistency
Proceedings of the 31st annual international symposium on Computer architecture
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Automatic Thread Extraction with Decoupled Software Pipelining
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
SableSpMT: a software framework for analysing speculative multithreading in Java
PASTE '05 Proceedings of the 6th ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools and engineering
POSH: a TLS compiler that exploits program structure
Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Speculative thread decomposition through empirical optimization
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Tight analysis of the performance potential of thread speculation using spec CPU 2006
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Hardware atomicity for reliable software speculation
Proceedings of the 34th annual international symposium on Computer architecture
Fast Track: A Software System for Speculative Program Optimization
Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization
Dynamic performance tuning for speculative threads
Proceedings of the 36th annual international symposium on Computer architecture
Polyhedral parallelization of binary code
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
Limits of region-based dynamic binary parallelization
Proceedings of the 9th ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Hi-index | 0.00 |
We present Runtime Automatic Speculative Parallelization (RASP), a technique for the dynamic extraction of speculative threads from a running application in a user-transparent fashion. By leveraging the idle cores in a CMP to analyze, optimize, and participate in the execution of a running sequential program, RASP enables a collection of simpler cores to achieve sequential performance on par with a significantly more complex core. In contrast to other systems for automatic speculative parallelization, RASP uses dynamic binary translation to optimize applications on-the-fly, without any need for recompilation or source code. RASP achieves these speedups without relying on special-purpose hardware support; RASP's dynamic profiling uses a clever variation on conventional performance monitoring, while RASP's speculative execution relies on the same simple hardware support for speculation that has been proposed for simplifying parallel programming. On a simulated cluster of four in-order cores, RASP accelerates SPEC2006 integer benchmarks by an average of 49%, with promising results for scientific and multimedia workloads as well.