Lock cohorting: a general technique for designing NUMA locks

Authors:
David Dice; Virendra J. Marathe; Nir Shavit
Affiliations:
Oracle Labs, Burlington, MA, USA; Oracle Labs, Burlington, MA, USA; MIT, Cambridge, MA, USA
Venue:
Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming
Year:
2012

Citing 10
Cited 7

Adaptive backoff synchronization techniques
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture

Algorithms for scalable synchronization on shared-memory multiprocessors
ACM Transactions on Computer Systems (TOCS)

Scalable queue-based spin locks with timeout
PPoPP '01 Proceedings of the eighth ACM SIGPLAN symposium on Principles and practices of parallel programming

Mostly lock-free malloc
Proceedings of the 3rd international symposium on Memory management

Non-blocking timeout in scalable queue-based spin locks
Proceedings of the twenty-first annual symposium on Principles of distributed computing

Hierarchical Backoff Locks for Nonuniform Communication Architectures
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture

The Art of Multiprocessor Programming

Flat combining and the synchronization-parallelism tradeoff
Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures

Flat-combining NUMA locks
Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures

A hierarchical CLH queue lock
Euro-Par'06 Proceedings of the 12th international conference on Parallel Processing

Fast concurrent queues for x86 processors
Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming

NUMA-aware reader-writer locks
Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming

Nonblocking algorithms and scalable multicore programming
Communications of the ACM

Scalable statistics counters
Proceedings of the twenty-fifth annual ACM symposium on Parallelism in algorithms and architectures

Nonblocking Algorithms and Scalable Multicore Programming
Queue - Concurrency

Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
ACM SIGOPS 24th Symposium on Operating Systems Principles

Everything you always wanted to know about synchronization but were afraid to ask
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles

Quantified Score

Hi-index	0.02

Abstract

Multicore machines are quickly shifting to NUMA and CC-NUMA architectures, making scalable NUMA-aware locking algorithms, ones that take into account the machines' non-uniform memory and caching hierarchy, ever more important. This paper presents lock cohorting, a general new technique for designing NUMA-aware locks that is as simple as it is powerful. Lock cohorting allows one to transform any spin-lock algorithm, with minimal non-intrusive changes, into a scalable NUMA-aware spin-lock. Our new cohorting technique allows us to easily create NUMA-aware versions of the TATAS-Backoff, CLH, MCS, and ticket locks, to name a few. Moreover, it allows us to derive a CLH-based cohort abortable lock, the first NUMA-aware queue lock to support abortability. We empirically compared the performance of cohort locks with prior NUMA-aware and classic NUMA-oblivious locks on a synthetic micro-benchmark, on memcached (a real-world key-value store application), and on the libc memory allocator. Our results demonstrate that cohort locks perform as well as or better than known locks when the load is low and significantly outperform them as the load increases.
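
The abstract does not spell out the mechanism, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes one global lock shared by all nodes plus one local lock per NUMA node, where a releasing thread hands the global lock directly to a waiter on its own node when one exists. All names (`TicketLock`, `CohortLock`, `max_handoffs`, `run_demo`) are hypothetical, the node id is passed in by the caller rather than queried from the hardware, and Python's `threading.Lock` stands in for a real spin lock and for atomic fetch-and-add.

```python
import threading
import time


class TicketLock:
    """Local (per-node) lock. A ticket lock can tell whether anyone is waiting."""

    def __init__(self):
        self._mutex = threading.Lock()  # stand-in for an atomic fetch-and-add
        self._next_ticket = 0
        self._now_serving = 0
        # True when the previous holder passed global ownership along with this lock.
        self.global_owned = False

    def acquire(self):
        with self._mutex:
            ticket = self._next_ticket
            self._next_ticket += 1
        while self._now_serving != ticket:
            time.sleep(0)  # yield; a real spin lock would busy-wait here

    def alone(self):
        # Called by the holder only: True if no other thread has taken a ticket.
        return self._next_ticket == self._now_serving + 1

    def release(self):
        self._now_serving += 1


class CohortLock:
    """Sketch: global lock + per-node local locks with bounded local handoff."""

    def __init__(self, num_nodes, max_handoffs=64):
        # threading.Lock may be released by a thread other than the acquirer,
        # which is what lets ownership of the global lock travel within a node.
        self._global = threading.Lock()
        self._local = [TicketLock() for _ in range(num_nodes)]
        self._handoffs = [0] * num_nodes
        self._max_handoffs = max_handoffs  # assumed fairness bound

    def acquire(self, node):
        local = self._local[node]
        local.acquire()
        if local.global_owned:
            return  # a same-node predecessor left the global lock held for us
        self._global.acquire()

    def release(self, node):
        local = self._local[node]
        if not local.alone() and self._handoffs[node] < self._max_handoffs:
            # Hand off within the node: keep the global lock, release only the local one.
            self._handoffs[node] += 1
            local.global_owned = True
            local.release()
        else:
            # No local waiter, or handoff budget used up: let other nodes in.
            self._handoffs[node] = 0
            local.global_owned = False
            self._global.release()
            local.release()


def run_demo(num_threads=4, iters=200, num_nodes=2):
    """Increment a shared counter under the cohort lock and return its final value."""
    lock = CohortLock(num_nodes)
    total = [0]

    def worker(node):
        for _ in range(iters):
            lock.acquire(node)
            total[0] += 1
            lock.release(node)

    threads = [threading.Thread(target=worker, args=(i % num_nodes,))
               for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]
```

In this sketch the handoff bound keeps one node from holding the global lock indefinitely while threads on other nodes wait. Python cannot show the cache-locality benefit the abstract refers to; the code only illustrates the control flow of acquiring and releasing.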