Linearizability: a correctness condition for concurrent objects
ACM Transactions on Programming Languages and Systems (TOPLAS)
ACM Transactions on Programming Languages and Systems (TOPLAS)
A methodology for implementing highly concurrent data objects
ACM Transactions on Programming Languages and Systems (TOPLAS)
A completeness theorem for a class of synchronization objects
PODC '93 Proceedings of the twelfth annual ACM symposium on Principles of distributed computing
STOC '95 Proceedings of the twenty-seventh annual ACM symposium on Theory of computing
Simple, fast, and practical non-blocking and blocking concurrent queue algorithms
PODC '96 Proceedings of the fifteenth annual ACM symposium on Principles of distributed computing
Practical implementations of non-blocking synchronization primitives
PODC '97 Proceedings of the sixteenth annual ACM symposium on Principles of distributed computing
A lower bound on the local time complexity of universal constructions
PODC '98 Proceedings of the seventeenth annual ACM symposium on Principles of distributed computing
A polylog time wait-free construction for closed objects
PODC '98 Proceedings of the seventeenth annual ACM symposium on Principles of distributed computing
Nonblocking algorithms and preemption-safe locking on multiprogrammed shared memory multiprocessors
Journal of Parallel and Distributed Computing
Specifying Concurrent Program Modules
ACM Transactions on Programming Languages and Systems (TOPLAS)
f-arrays: implementation and applications
Proceedings of the twenty-first annual symposium on Principles of distributed computing
Universal Constructions for Large Objects
WDAG '95 Proceedings of the 9th International Workshop on Distributed Algorithms
Efficient and practical constructions of LL/SC variables
Proceedings of the twenty-second annual symposium on Principles of distributed computing
Wait-free queues with multiple enqueuers and dequeuers
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming
Hi-index | 0.00 |
Despite the ubiquitous need for shared FIFO queues in parallel applications and operating systems, there are no sublinear-time wait-free queue algorithms that can support more than a single enqueuer and a single dequeuer. Two independently designed algorithms—David's recent algorithm [1] and the algorithm in this paper—break this barrier. While David's algorithm is capable of supporting multiple dequeuers (but only one enqueuer), our algorithm can support multiple enqueuers (but only one dequeuer). David's algorithm achieves O(1) time complexity for both enqueue and dequeue operations, but its space complexity is infinite because of the use of infinite sized arrays. The author states that he can bound the space requirement, but only at the cost of increasing the time complexity to O(n), where n is the number of dequeuers. A significant feature of our algorithm is that its time and space complexities are both bounded and small: enqueue and dequeue operations run in $O(\lg n)$ time, and the space complexity is O(n+m), where n is the number of enqueuers and m is the actual number of items currently present in the queue. David's algorithm uses fetch&increment and swap instructions, which are both at level 2 of Herlihy's Consensus hierarchy, along with queue. Our algorithm uses the LL/SC instructions, which are universal. However, since these instructions have constant time wait-free implementation from CAS and restricted LL/SC that are widely supported on modern architectures, our algorithms can run efficiently on current machines. Thus, in applications where there are multiple producers and a single consumer (e.g., certain server queues and resource queues), our algorithm provides the best known solution to implementing a wait-free queue. Using similar ideas, we can also efficiently implement a stack that supports multiple pushers and a single popper.