Efficiently implementing a large number of LL/SC objects

Authors:
Prasad Jayanti;Srdjan Petrovic
Affiliations:
Department of Computer Science, Dartmouth College, Hanover, New Hampshire;Department of Computer Science, Dartmouth College, Hanover, New Hampshire
Venue:
OPODIS'05 Proceedings of the 9th international conference on Principles of Distributed Systems
Year:
2005

Citing 18

Linearizability: a correctness condition for concurrent objects
ACM Transactions on Programming Languages and Systems (TOPLAS)

Wait-free synchronization
ACM Transactions on Programming Languages and Systems (TOPLAS)

Alpha architecture reference manual
Alpha architecture reference manual

A methodology for implementing highly concurrent data objects
ACM Transactions on Programming Languages and Systems (TOPLAS)

Disjoint-access-parallel implementations of strong shared memory primitives
PODC '94 Proceedings of the thirteenth annual ACM symposium on Principles of distributed computing

Universal constructions for multi-object operations
Proceedings of the fourteenth annual ACM symposium on Principles of distributed computing

Practical implementations of non-blocking synchronization primitives
PODC '97 Proceedings of the sixteenth annual ACM symposium on Principles of distributed computing

A polylog time wait-free construction for closed objects
PODC '98 Proceedings of the seventeenth annual ACM symposium on Principles of distributed computing

Concurrent Reading While Writing
ACM Transactions on Programming Languages and Systems (TOPLAS)

Concurrent reading and writing
Communications of the ACM

f-arrays: implementation and applications
Proceedings of the twenty-first annual symposium on Principles of distributed computing

Universal Constructions for Large Objects
WDAG '95 Proceedings of the 9th International Workshop on Distributed Algorithms

Nonblocking k-compare-single-swap
Proceedings of the fifteenth annual ACM symposium on Parallel algorithms and architectures

Obstruction-Free Synchronization: Double-Ended Queues as an Example
ICDCS '03 Proceedings of the 23rd International Conference on Distributed Computing Systems

Efficient and practical constructions of LL/SC variables
Proceedings of the twenty-second annual symposium on Principles of distributed computing

Bringing practical lock-free synchronization to 64-bit applications
Proceedings of the twenty-third annual ACM symposium on Principles of distributed computing

An optimal multi-writer snapshot algorithm
Proceedings of the thirty-seventh annual ACM symposium on Theory of computing

Efficient Wait-Free Implementation of Multiword LL/SC Variables
ICDCS '05 Proceedings of the 25th IEEE International Conference on Distributed Computing Systems

Cited 1

Pragmatic primitives for non-blocking data structures
Proceedings of the 2013 ACM symposium on Principles of distributed computing

Abstract

Over the past decade, a pair of instructions called load-linked (LL) and store-conditional (SC) have emerged as the most suitable synchronization instructions for the design of lock-free algorithms. However, current architectures do not support these instructions; instead, they support either compare-and-swap (CAS) (e.g., UltraSPARC, Itanium) or restricted versions of LL/SC, known as RLL/RSC (e.g., POWER4, MIPS, Alpha). Thus, there is a gap between what algorithm designers want (namely, LL/SC) and what multiprocessors actually support (namely, CAS or RLL/RSC). To bridge this gap, a flurry of algorithms that implement LL/SC from CAS have appeared in the literature. The two most recent algorithms are due to Doherty, Herlihy, Luchangco, and Moir (2004) and Michael (2004). To implement M LL/SC objects shared by N processes, Doherty et al.'s algorithm uses only O(N + M) space, but is only non-blocking and not wait-free. Michael's algorithm, on the other hand, is wait-free, but uses O(N^2 + M) space. The main drawback of his algorithm is the time complexity of the SC operation: although the expected amortized running time of SC is only O(1), the worst-case running time of SC is O(N^2). The algorithm in this paper overcomes this drawback. Specifically, we design a wait-free algorithm that achieves a space complexity of O(N^2 + M), while still maintaining the O(1) worst-case running time for LL and SC operations.
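
To make the LL/SC-from-CAS idea concrete, here is a minimal sketch of the textbook tag-based construction. It is not the algorithm of this paper, nor those of Doherty et al. or Michael. It assumes an unbounded version tag stored in the same CAS word as the value, and, because Python has no hardware CAS, it simulates CAS with a lock. The names `CASCell` and `TaggedLLSC` are hypothetical and chosen only for illustration.

```python
import threading


class CASCell:
    """Simulated CAS word (assumption: a lock stands in for hardware atomicity)."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def compare_and_swap(self, expected, new):
        # Atomically replace the contents with `new` only if they equal `expected`.
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False


class TaggedLLSC:
    """LL/SC object built from one CAS word holding a (value, tag) pair."""

    def __init__(self, initial):
        self._cell = CASCell((initial, 0))
        self._linked = threading.local()  # per-thread snapshot taken by LL

    def ll(self):
        # Remember the exact (value, tag) pair this thread observed.
        snapshot = self._cell.load()
        self._linked.snapshot = snapshot
        return snapshot[0]

    def sc(self, new_value):
        # Succeed only if no successful SC happened since this thread's LL.
        snapshot = getattr(self._linked, "snapshot", None)
        if snapshot is None:
            return False  # SC without a preceding LL fails
        self._linked.snapshot = None  # the link is consumed by this SC
        value, tag = snapshot
        return self._cell.compare_and_swap(snapshot, (new_value, tag + 1))


# Usage: a retry loop that increments the object.
counter = TaggedLLSC(0)
while True:
    current = counter.ll()
    if counter.sc(current + 1):
        break
```

Every successful SC bumps the tag, so an SC fails even when another thread has changed the value and then restored it. The price is that the tag must never wrap around and must fit in the CAS word alongside the value, which is the kind of restriction the constructions discussed in the abstract are designed to avoid.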