Software transactional memory for dynamic-sized data structures
Proceedings of the twenty-second annual symposium on Principles of distributed computing
McRT-STM: a high performance software transactional memory system for a multi-core runtime
Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming
McRT-Malloc: a scalable transactional memory allocator
Proceedings of the 5th international symposium on Memory management
Optimizing memory transactions
Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation
Compiler and runtime support for efficient software transactional memory
Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation
Proceedings of the 33rd annual international symposium on Computer Architecture
Implicit parallelism with ordered transactions
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Time-based transactional memory with scalable time bases
Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures
MetaTM/TxLinux: transactional memory for an operating system
Proceedings of the 34th annual international symposium on Computer architecture
Understanding Tradeoffs in Software Transactional Memory
Proceedings of the International Symposium on Code Generation and Optimization
Code Generation and Optimization for Transactional Memory Constructs in an Unmanaged Language
Proceedings of the International Symposium on Code Generation and Optimization
JudoSTM: A Dynamic Binary-Rewriting Approach to Software Transactional Memory
PACT '07 Proceedings of the 16th International Conference on Parallel Architecture and Compilation Techniques
Toward high performance nonblocking software transactional memory
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming
Dynamic performance tuning of word-based software transactional memory
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming
RingSTM: scalable transactions with a single atomic instruction
Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures
Practical weak-atomicity semantics for java stm
Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures
Implementing and Exploiting Inevitability in Software Transactional Memory
ICPP '08 Proceedings of the 2008 37th International Conference on Parallel Processing
Conflict detection and validation strategies for software transactional memory
DISC'06 Proceedings of the 20th international conference on Distributed Computing
DISC'06 Proceedings of the 20th international conference on Distributed Computing
A lazy snapshot algorithm with eager validation
DISC'06 Proceedings of the 20th international conference on Distributed Computing
Adaptive software transactional memory
DISC'05 Proceedings of the 19th international conference on Distributed Computing
Lightweight, robust adaptivity for software transactional memory
Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
Extensible software transactional memory
Proceedings of the Third C* Conference on Computer Science and Software Engineering
Euro-Par'10 Proceedings of the 16th international Euro-Par conference on Parallel processing: Part II
Lowering STM overhead with static analysis
LCPC'10 Proceedings of the 23rd international conference on Languages and compilers for parallel computing
A transactional memory with automatic performance tuning
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
Delegation and nesting in best-effort hardware transactional memory
Proceedings of the twenty-fourth annual ACM symposium on Parallelism in algorithms and architectures
Sandboxing transactional memory
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
A transactional runtime system for the Cell/BE architecture
Journal of Parallel and Distributed Computing
Software Transactional Memory for GPU Architectures
Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization
Hi-index | 0.00 |
Most research into high-performance software transactional memory (STM) assumes that transactions will run on a processor with a relatively strict memory model, such as Total Store Ordering (TSO). To execute these algorithms correctly on processors with relaxed memory models, explicit fence instructions may be required on every transactional access, and neither the processor nor the compiler may be able to safely reorder transactional reads. The overheads of fence instructions and read serialization are a significant but unstudied source of latency for STM, with impact on the tradeoffs among different STM systems and on the optimizations that may be possible for any given system. Straightforward ports of STM runtimes from strict to relaxed machines may fail to realize the latter's performance potential. We explore the implementation of STM for machines with relaxed memory consistency using two recent high-performance STM systems. We propose compiler optimizations that can safely eliminate many fence instructions. Using these techniques, we obtain a reduction of up to 89% in the number of fences, and 20% in per-transaction latency, for common transactional benchmarks.