Adaptive backoff synchronization techniques
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Algorithms for scalable synchronization on shared-memory multiprocessors
ACM Transactions on Computer Systems (TOCS)
Scalable queue-based spin locks with timeout
PPoPP '01 Proceedings of the eighth ACM SIGPLAN symposium on Principles and practices of parallel programming
Proceedings of the 3rd international symposium on Memory management
Non-blocking timeout in scalable queue-based spin locks
Proceedings of the twenty-first annual symposium on Principles of distributed computing
Hierarchical Backoff Locks for Nonuniform Communication Architectures
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
The Art of Multiprocessor Programming
The Art of Multiprocessor Programming
Flat combining and the synchronization-parallelism tradeoff
Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures
Euro-Par'06 Proceedings of the 12th international conference on Parallel Processing
Fast concurrent queues for x86 processors
Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming
NUMA-aware reader-writer locks
Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming
Nonblocking algorithms and scalable multicore programming
Communications of the ACM
Proceedings of the twenty-fifth annual ACM symposium on Parallelism in algorithms and architectures
Nonblocking Algorithms and Scalable Multicore Programming
Queue - Concurrency
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
ACM SIGOPS 24th Symposium on Operating Systems Principles
Everything you always wanted to know about synchronization but were afraid to ask
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
Hi-index | 0.02 |
Multicore machines are quickly shifting to NUMA and CC-NUMA architectures, making scalable NUMA-aware locking algorithms, ones that take into account the machines' non-uniform memory and caching hierarchy, ever more important. This paper presents lock cohorting, a general new technique for designing NUMA-aware locks that is as simple as it is powerful. Lock cohorting allows one to transform any spin-lock algorithm, with minimal non-intrusive changes, into scalable NUMA-aware spin-locks. Our new cohorting technique allows us to easily create NUMA-aware versions of the TATAS-Backoff, CLH, MCS, and ticket locks, to name a few. Moreover, it allows us to derive a CLH-based cohort abortable lock, the first NUMA-aware queue lock to support abortability. We empirically compared the performance of cohort locks with prior NUMA-aware and classic NUMA-oblivious locks on a synthetic micro-benchmark, a real world key-value store application memcached, as well as the libc memory allocator. Our results demonstrate that cohort locks perform as well or better than known locks when the load is low and significantly out-perform them as the load increases.