Efficiently implementing a large number of LL/SC objects

  • Authors:
  • Prasad Jayanti; Srdjan Petrovic

  • Affiliations:
  • Department of Computer Science, Dartmouth College, Hanover, New Hampshire; Department of Computer Science, Dartmouth College, Hanover, New Hampshire

  • Venue:
  • OPODIS'05 Proceedings of the 9th international conference on Principles of Distributed Systems
  • Year:
  • 2005

Abstract

Over the past decade, a pair of instructions called load-linked (LL) and store-conditional (SC) have emerged as the most suitable synchronization instructions for the design of lock-free algorithms. However, current architectures do not support these instructions; instead, they support either CAS (e.g., UltraSPARC, Itanium) or restricted versions of LL/SC (e.g., POWER4, MIPS, Alpha). Thus, there is a gap between what algorithm designers want (namely, LL/SC) and what multiprocessors actually support (namely, CAS or RLL/RSC). To bridge this gap, a flurry of algorithms that implement LL/SC from CAS have appeared in the literature. The two most recent algorithms are due to Doherty, Herlihy, Luchangco, and Moir (2004) and Michael (2004). To implement M LL/SC objects shared by N processes, Doherty et al.'s algorithm uses only O(N + M) space, but is only non-blocking and not wait-free. Michael's algorithm, on the other hand, is wait-free, but uses O(N² + M) space. The main drawback of his algorithm is the time complexity of the SC operation: although the expected amortized running time of SC is only O(1), the worst-case running time of SC is O(N²). The algorithm in this paper overcomes this drawback. Specifically, we design a wait-free algorithm that achieves a space complexity of O(N² + M), while still maintaining the O(1) worst-case running time for LL and SC operations.
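
To make the LL/SC semantics discussed in the abstract concrete, the following is a minimal sketch of the textbook "value plus tag" construction of an LL/SC object from a single CAS word. It is not the algorithm of this paper, nor that of Doherty et al. or Michael; it assumes an unbounded tag, which real hardware words cannot provide. The names `CASWord` and `TaggedLLSC` are hypothetical, and the lock inside `CASWord` only emulates the atomicity that a hardware CAS instruction would supply.

```python
import threading


class CASWord:
    """Hypothetical stand-in for a memory word that supports compare-and-swap.

    The lock only emulates the atomicity of a hardware CAS instruction.
    """

    def __init__(self, initial):
        self._value = initial
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def cas(self, expected, new):
        # Atomically replace the contents with `new` if they equal `expected`.
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False


class TaggedLLSC:
    """LL/SC object stored as a (value, tag) pair in one CAS word.

    Assumption: the tag never wraps around. Every successful SC increments it,
    so an unchanged pair means no successful SC has occurred since the LL.
    """

    def __init__(self, initial):
        self._word = CASWord((initial, 0))
        self._link = threading.local()  # each thread remembers its latest LL

    def ll(self):
        snapshot = self._word.read()
        self._link.snapshot = snapshot
        return snapshot[0]

    def sc(self, new_value):
        snapshot = getattr(self._link, "snapshot", None)
        if snapshot is None:
            return False  # this thread has not performed an LL yet
        _, tag = snapshot
        # Succeeds only if the pair is still the one observed by the LL.
        return self._word.cas(snapshot, (new_value, tag + 1))
```

In this sketch an SC fails even when another thread has written back the same value in the meantime, because the tag has changed. That is the behavior that distinguishes LL/SC from a plain CAS on the value alone. The sketch sidesteps the difficulty the abstract is concerned with: with a bounded word, the tag can wrap around and the value must share space with it, which is why the algorithms cited above use additional space per process.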