PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Data speculation support for a chip multiprocessor
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Dynamo: a transparent dynamic optimization system
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Parallel Programming with Polaris
Computer
TEST: a tracer for extracting speculative threads
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Compiler support for speculative multithreading architecture with probabilistic points-to analysis
Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
The Potential for Using Thread-Level Data Speculation to Facilitate Automatic Parallelization
HPCA '98 Proceedings of the 4th International Symposium on High-Performance Computer Architecture
PACT '96 Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques
Min-cut program decomposition for thread-level speculation
Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation
A cost-driven compilation framework for speculative parallelization of sequential programs
Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation
Mitosis compiler: an infrastructure for speculative threading based on pre-computation slices
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Tasking with out-of-order spawn in TLS chip multiprocessors: microarchitecture and compilation
Proceedings of the 19th annual international conference on Supercomputing
Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
POSH: a TLS compiler that exploits program structure
Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Executing Java programs with transactional memory
Science of Computer Programming - Special issue: Synchronization and concurrency in object-oriented languages
Speculative thread decomposition through empirical optimization
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Automatic Trace-Based Parallelization of Java Programs
ICPP '07 Proceedings of the 2007 International Conference on Parallel Processing
LogTM-SE: Decoupling Hardware Transactional Memory from Caches
HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
Exploiting Postdominance for Speculative Parallelization
HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
HydraVM: extracting parallelism from legacy sequential code using STM
HotPar'12 Proceedings of the 4th USENIX conference on Hot Topics in Parallelism
Hi-index | 0.00 |
We describe a framework for trace-based parallelization of recursive Java programs. We also explore and evaluate the feasibility of using a hardware transactional memory (HTM) system to handle dependences. We design, implement, and evaluate a system that takes as input a sequential program, identifies traces on it, and groups these traces into coarse-grain units of computation, or tasks. We then insert code that allows tasks to execute in parallel transactions using a fork/join paradigm. We also present a software algorithm that ensures sequential program order is maintained by transactional memory. We identify the associated issues and describe criteria that are necessary for Java programs to execute successfully on HTM systems. Our evaluation using JOlden benchmarks indicates that the computational phases of the benchmarks can be executed effectively on HTM systems. The average speedup is 2.7 for four processors. We conclude that HTM is a viable solution to dealing with dependences when performing trace-based parallelization.