Using hardware transactional memory to correct and simplify a readers-writer lock algorithm

Authors:
Dave Dice;Yossi Lev;Yujie Liu;Victor Luchangco;Mark Moir
Affiliations:
Oracle Labs, Burlington, MA, USA;Oracle Labs, Burlington, MA, USA;Lehigh University, Bethlehem, PA, USA;Oracle Labs, Burlington, MA, USA;Oracle Labs, Burlington, MA, USA
Venue:
Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming
Year:
2013

Abstract

Designing correct synchronization algorithms is notoriously difficult, as evidenced by a bug we have identified that has apparently gone unnoticed in a well-known synchronization algorithm for nearly two decades. We use hardware transactional memory (HTM) to construct a corrected version of the algorithm. This version is significantly simpler than the original and furthermore improves on it by eliminating usage constraints and reducing space requirements. Performance of the HTM-based algorithm is competitive with the original in "normal" conditions, but it does suffer somewhat under heavy contention. We successfully apply some optimizations to help close this gap, but we also find that they are incompatible with known techniques for improving progress properties. We discuss ways in which future HTM implementations may address these issues. Finally, although our focus is on how effectively HTM can correct and simplify the algorithm, we also suggest bug fixes and workarounds that do not depend on HTM.