Software Transactional Memory (STM) compilers commonly instrument memory accesses by transforming them into calls to STM library functions. Done naïvely, this instrumentation imposes a large overhead that slows down transaction execution. Many compiler optimizations have been proposed to lower this overhead. In this paper we drive the STM overhead lower still by identifying sources of sub-optimal instrumentation and providing optimizations that eliminate them. The sources are: (1) redundant reads of memory locations that have already been read, (2) redundant writes to memory locations that will subsequently be overwritten, (3) redundant writeset lookups for memory locations that have not been written, and (4) redundant writeset record-keeping for memory locations that will not be read. We describe how static analysis and code motion algorithms can detect these sources and enable compile-time optimizations that significantly reduce the instrumentation overhead in many common cases. We implement the optimizations on top of a TL2 Java-based STM system and demonstrate their effectiveness on various benchmarks, measuring a speedup of up to 29-50% in a single-threaded run and up to 19% higher throughput in a 32-thread run.
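The four sources of redundant instrumentation can be illustrated with a toy model. The sketch below is not the paper's implementation and not the API of any real STM library: the class `StmBarrierSketch`, the `Tx` context, its barrier methods, and the "write index" (a stand-in for whatever structure an STM consults on read-after-write) are all hypothetical names introduced for illustration, and the model omits versioning, locking, and validation entirely. It only counts barrier work for the transaction body `tmp = a + a; c = tmp; c = tmp + b;` under naive instrumentation and under a hand-optimized version of the kind a compiler might emit once static analysis has proven the relevant facts.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StmBarrierSketch {
    /** Toy shared memory cell; stands in for an instrumented field (hypothetical). */
    static final class Cell {
        int value;
        Cell(int value) { this.value = value; }
    }

    /** One buffered write. */
    static final class WriteEntry {
        final Cell cell;
        final int value;
        WriteEntry(Cell cell, int value) { this.cell = cell; this.value = value; }
    }

    /** Toy transaction context: buffers writes and counts barrier work. No validation or locking. */
    static final class Tx {
        final List<WriteEntry> writeLog = new ArrayList<>();    // replayed at commit
        final Map<Cell, Integer> writeIndex = new HashMap<>();  // consulted on read-after-write
        int readBarriers, writeBarriers, writeSetLookups, indexUpdates;

        /** Full read barrier: look in the write set first, then fall back to memory. */
        int read(Cell c) {
            readBarriers++;
            writeSetLookups++;
            Integer buffered = writeIndex.get(c);
            return buffered != null ? buffered : c.value;
        }

        /** Cheaper read barrier, valid only if c is known not to have been written in this transaction. */
        int readUnwritten(Cell c) {
            readBarriers++;
            return c.value;
        }

        /** Full write barrier: log the write and record it for later lookups. */
        void write(Cell c, int v) {
            writeBarriers++;
            indexUpdates++;
            writeLog.add(new WriteEntry(c, v));
            writeIndex.put(c, v);
        }

        /** Cheaper write barrier, valid only if c is known not to be read later in this transaction. */
        void writeNeverReadBack(Cell c, int v) {
            writeBarriers++;
            writeLog.add(new WriteEntry(c, v));
        }

        /** Replay buffered writes in program order. */
        void commit() {
            for (WriteEntry e : writeLog) {
                e.cell.value = e.value;
            }
        }
    }

    /** Naive instrumentation: every access becomes a full barrier call. */
    static Tx runNaive(Cell a, Cell b, Cell c) {
        Tx tx = new Tx();
        int tmp = tx.read(a) + tx.read(a);  // (1) a is read twice
        tx.write(c, tmp);                   // (2) overwritten by the next write
        tx.write(c, tmp + tx.read(b));
        tx.commit();
        return tx;
    }

    /** Hand-optimized instrumentation, assuming analysis proved the facts in the comments. */
    static Tx runOptimized(Cell a, Cell b, Cell c) {
        Tx tx = new Tx();
        int av = tx.readUnwritten(a);       // (1) read once and reuse; (3) a is never written here
        int tmp = av + av;
        // (2) the first write to c is dropped; (4) c is never read back, so skip the index update
        tx.writeNeverReadBack(c, tmp + tx.readUnwritten(b));
        tx.commit();
        return tx;
    }

    public static void main(String[] args) {
        Tx naive = runNaive(new Cell(2), new Cell(3), new Cell(0));
        Tx opt = runOptimized(new Cell(2), new Cell(3), new Cell(0));
        System.out.println("naive:     reads=" + naive.readBarriers + " lookups=" + naive.writeSetLookups
                + " writes=" + naive.writeBarriers + " indexUpdates=" + naive.indexUpdates);
        System.out.println("optimized: reads=" + opt.readBarriers + " lookups=" + opt.writeSetLookups
                + " writes=" + opt.writeBarriers + " indexUpdates=" + opt.indexUpdates);
    }
}
```

Both variants commit the same value to `c`; the optimized one simply performs fewer barrier operations. The cheaper barriers are only safe under the stated preconditions, which is why such transformations depend on static analysis rather than on local rewriting.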