Non-Uniform Memory Access (NUMA) architectures are gaining importance in mainstream computing systems due to the rapid growth of multi-core multi-chip machines. Extracting the best possible performance from these new machines will require us to revisit the design of the concurrent algorithms and synchronization primitives which form the building blocks of many of today's applications. This paper revisits one such critical synchronization primitive -- the reader-writer lock. We present what is, to the best of our knowledge, the first family of reader-writer lock algorithms tailored to NUMA architectures. We present several variations which trade fairness between readers and writers for higher concurrency among readers and better back-to-back batching of writers from the same NUMA node. Our algorithms leverage the lock cohorting technique to manage synchronization between writers in a NUMA-friendly fashion, binary flags to coordinate readers and writers, and simple distributed reader counter implementations to enable NUMA-friendly concurrency among readers. The end result is a collection of surprisingly simple NUMA-aware algorithms that outperform the state-of-the-art reader-writer locks by up to a factor of 10 in our microbenchmark experiments. To evaluate our algorithms in a realistic setting we also present performance results of the kccachetest benchmark of the Kyoto-Cabinet distribution, an open-source database which makes heavy use of pthread reader-writer locks. Our locks boost the performance of kccachetest by up to 40% over the best prior alternatives.
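The three ingredients named above (a lock that serializes writers, a binary flag that coordinates readers with writers, and a distributed reader counter) can be illustrated with a minimal sketch. This is not the paper's algorithm: a plain mutex stands in for the cohort lock, so no NUMA-aware writer batching takes place, the "node" index is just an integer supplied by the caller, and the policy shown is an assumed writer-preference variant. All names (`DistributedCounterRWLock`, `reading`, `writing`, `run_demo`) are hypothetical and chosen for illustration only.

```python
import threading
import time
from contextlib import contextmanager


class DistributedCounterRWLock:
    """Illustrative writer-preference reader-writer lock (sketch only)."""

    def __init__(self, num_nodes):
        # Assumption: a plain mutex replaces the cohort lock for writers.
        self._writer_mutex = threading.Lock()
        # Binary flag; "set" means no writer currently holds the lock.
        self._no_writer = threading.Event()
        self._no_writer.set()
        # One reader counter per node, each with its own guard, so readers
        # on different nodes never update the same counter.
        self._readers = [0] * num_nodes
        self._guards = [threading.Lock() for _ in range(num_nodes)]

    def _adjust(self, node, delta):
        with self._guards[node]:
            self._readers[node] += delta

    def _readers_present(self):
        for node, guard in enumerate(self._guards):
            with guard:
                if self._readers[node]:
                    return True
        return False

    def acquire_read(self, node):
        while True:
            self._no_writer.wait()           # block while a writer is active
            self._adjust(node, +1)           # announce this reader
            if self._no_writer.is_set():     # re-check: did a writer arrive?
                return
            self._adjust(node, -1)           # back out and retry

    def release_read(self, node):
        self._adjust(node, -1)

    def acquire_write(self):
        self._writer_mutex.acquire()         # serialize writers
        self._no_writer.clear()              # stop new readers
        while self._readers_present():       # wait for readers to drain
            time.sleep(0)

    def release_write(self):
        self._no_writer.set()
        self._writer_mutex.release()

    @contextmanager
    def reading(self, node):
        self.acquire_read(node)
        try:
            yield
        finally:
            self.release_read(node)

    @contextmanager
    def writing(self):
        self.acquire_write()
        try:
            yield
        finally:
            self.release_write()


def run_demo(num_nodes=2, readers=4, writers=2, rounds=200):
    """Return (final value, number of readers that overlapped a writer)."""
    lock = DistributedCounterRWLock(num_nodes)
    state = {"value": 0, "writer_inside": False, "violations": 0}

    def reader(node):
        for _ in range(rounds):
            with lock.reading(node):
                if state["writer_inside"]:
                    state["violations"] += 1

    def writer():
        for _ in range(rounds):
            with lock.writing():
                state["writer_inside"] = True
                current = state["value"]
                time.sleep(0)                # invite a race if exclusion failed
                state["value"] = current + 1
                state["writer_inside"] = False

    threads = [threading.Thread(target=reader, args=(i % num_nodes,))
               for i in range(readers)]
    threads += [threading.Thread(target=writer) for _ in range(writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["value"], state["violations"]
```

Calling `run_demo()` exercises the lock with a few reader and writer threads and returns the final counter value together with the number of readers that observed a writer inside its critical section. The reader path increments its node's counter and then re-checks the flag, while the writer clears the flag and then scans the counters; that ordering is what keeps a reader and a writer from both proceeding. A real NUMA-aware implementation would additionally place each counter on memory local to its node and use atomic operations rather than per-counter mutexes.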