A scalable lock-free stack algorithm

Authors:
Danny Hendler;Nir Shavit;Lena Yerushalmi
Affiliations:
Tel-Aviv University;Tel-Aviv University and Sun Microsystems Laboratories;Tel-Aviv University
Venue:
Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures
Year:
2004

Abstract

The literature describes two high-performance concurrent stack algorithms, based on combining funnels and on elimination trees. Unfortunately, the funnels are linearizable but blocking, and the elimination trees are non-blocking but not linearizable. Neither is used in practice, since they perform well only at exceptionally high loads. The literature also describes a simple lock-free linearizable stack algorithm that works at low loads but does not scale as the load increases. The question of designing a stack algorithm that is non-blocking, linearizable, and scales well throughout the concurrency range has thus remained open.

This paper presents such a concurrent stack algorithm. It is based on the following simple observation: a single elimination array used as a backoff scheme for a simple lock-free stack is lock-free, linearizable, and scalable. As our empirical results show, the resulting elimination-backoff stack performs as well as the simple stack at low loads, and increasingly outperforms all other methods (lock-based and non-blocking) as concurrency increases. We believe its simplicity and scalability make it a viable practical alternative to existing constructions for implementing concurrent stacks.
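
The observation in the abstract can be illustrated with a minimal Java sketch. It is not the authors' code: the class name `EliminationBackoffStack`, the `width` and `waitMicros` parameters, and the helper `visit` are hypothetical names chosen for illustration, and the JDK's `java.util.concurrent.Exchanger` stands in for the slots of the elimination array. A thread first tries a compare-and-set on the top pointer, as in the simple lock-free stack. If that fails because of contention, it backs off into a random slot, where a push and a pop that meet cancel each other without touching the central stack.

```java
import java.util.concurrent.Exchanger;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicReference;

// Illustrative sketch only; names and parameters are assumptions, not the paper's.
class EliminationBackoffStack<T> {
    private static final class Node<T> {
        final T value;
        Node<T> next;
        Node(T value) { this.value = value; }
    }

    private final AtomicReference<Node<T>> top = new AtomicReference<>();
    private final Exchanger<T>[] slots;   // stands in for the elimination array
    private final long waitMicros;        // how long to wait for a partner

    @SuppressWarnings("unchecked")
    EliminationBackoffStack(int width, long waitMicros) {
        slots = (Exchanger<T>[]) new Exchanger[width];
        for (int i = 0; i < width; i++) slots[i] = new Exchanger<>();
        this.waitMicros = waitMicros;
    }

    // Offer `mine` at a random slot. A push offers its value, a pop offers null.
    // Returns the partner's offer, or throws TimeoutException if nobody arrives.
    private T visit(T mine) throws TimeoutException {
        int i = ThreadLocalRandom.current().nextInt(slots.length);
        try {
            return slots[i].exchange(mine, waitMicros, TimeUnit.MICROSECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // keep the interrupt flag set
            throw new TimeoutException();         // treat as "no partner found"
        }
    }

    void push(T value) {
        if (value == null) throw new NullPointerException("null marks a pop");
        Node<T> node = new Node<>(value);
        while (true) {
            Node<T> old = top.get();
            node.next = old;
            if (top.compareAndSet(old, node)) return;   // fast path: CAS succeeded
            try {
                if (visit(value) == null) return;       // met a pop: eliminated
                // met another push: nothing was eliminated, retry the CAS
            } catch (TimeoutException e) {
                // no partner arrived in time, retry the CAS
            }
        }
    }

    // Returns the popped value, or null if the stack is empty.
    T pop() {
        while (true) {
            Node<T> old = top.get();
            if (old == null) return null;
            if (top.compareAndSet(old, old.next)) return old.value;
            try {
                T got = visit(null);
                if (got != null) return got;            // met a push: take its value
                // met another pop: retry the CAS
            } catch (TimeoutException e) {
                // no partner arrived in time, retry the CAS
            }
        }
    }
}
```

With a single thread the compare-and-set never fails, so the sketch behaves exactly like the simple stack and the elimination slots are never visited. They come into play only under contention, which matches the abstract's claim of equal performance at low loads. Whether the JDK `Exchanger` preserves the lock-freedom property of the paper's construction is not claimed here.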