Designing correct synchronization algorithms is notoriously difficult, as evidenced by a bug we have identified that has apparently gone unnoticed in a well-known synchronization algorithm for nearly two decades. We use hardware transactional memory (HTM) to construct a corrected version of the algorithm. This version is significantly simpler than the original and furthermore improves on it by eliminating usage constraints and reducing space requirements. Performance of the HTM-based algorithm is competitive with the original in "normal" conditions, but it does suffer somewhat under heavy contention. We successfully apply some optimizations to help close this gap, but we also find that they are incompatible with known techniques for improving progress properties. We discuss ways in which future HTM implementations may address these issues. Finally, although our focus is on how effectively HTM can correct and simplify the algorithm, we also suggest bug fixes and workarounds that do not depend on HTM.