Performance of memory reclamation for lockless synchronization

Authors:
Thomas E. Hart; Paul E. McKenney; Angela Demke Brown; Jonathan Walpole
Affiliations:
Department of Computer Science, University of Toronto, Toronto, Ont., Canada M5S 2E4
IBM Linux Technology Center, IBM Beaverton, Beaverton, OR 97006, USA
Department of Computer Science, University of Toronto, Toronto, Ont., Canada M5S 2E4
Department of Computer Science, Portland State University, Portland, OR 97207-0751, USA
Venue:
Journal of Parallel and Distributed Computing
Year:
2007


Abstract

Achieving high performance for concurrent applications on modern multiprocessors remains challenging. Many programmers avoid locking to improve performance, while others replace locks with non-blocking synchronization to protect against deadlock, priority inversion, and convoying. In both cases, dynamic data structures that avoid locking require a memory reclamation scheme that reclaims elements once they are no longer in use. The performance of existing memory reclamation schemes has not been thoroughly evaluated. We conduct the first fair and comprehensive comparison of three recent schemes (quiescent-state-based reclamation, epoch-based reclamation, and hazard-pointer-based reclamation) using a flexible microbenchmark. Our results show that there is no globally optimal scheme. When evaluating lockless synchronization, programmers and algorithm designers should thus carefully consider the data structure, the workload, and the execution environment, each of which can dramatically affect the memory reclamation performance. We discuss the consequences of our results for programmers and algorithm designers. Finally, we describe the use of one scheme, quiescent-state-based reclamation, in the context of an OS kernel, an execution environment that is well suited to this scheme.