Taking concurrency seriously (position paper)

  • Authors:
  • M. Herlihy

  • Affiliations:
  • Carnegie-Mellon Univ., Pittsburgh, PA

  • Venue:
  • OOPSLA/ECOOP '88: Proceedings of the 1988 ACM SIGPLAN Workshop on Object-Based Concurrent Programming
  • Year:
  • 1988

Abstract

I'd like to propose a challenge to language designers interested in concurrency: how well do your favorite constructs support highly concurrent data structures? For example, consider a real-time system consisting of a pool of sensor and actuator processes that communicate via a priority queue in shared memory. Processes execute asynchronously. When a sensor process detects a condition requiring a response, it records the condition, assigns it a priority, and places the record in the queue. Whenever an actuator process becomes idle, it dequeues the highest-priority item from the queue and takes appropriate action. The conventional way to prevent concurrent queue operations from interfering is to execute each operation as a critical section: only one process at a time is allowed to access the data structure. As long as one process is executing an operation, any other process needing to access the queue must wait. Although this approach is widely used, it has significant drawbacks.

It is not fault-tolerant. If one process unexpectedly halts in the middle of an operation, then other processes attempting to access the queue will wait forever. Although it may sometimes be possible to detect the failure and preempt the queue, such detection takes time, it may be unreliable, and it may be impossible to restore the data structure to a consistent state.

Critical sections force faster processes to wait for slower processes. Such waiting may be particularly undesirable in heterogeneous architectures, where some processors may be much faster than others. For example, a fast actuator process should not have to remain idle whenever a much slower sensor process is enqueuing a new item. Such waiting is also undesirable if each processor is dedicated to a single process, where delaying a process means idling a valuable hardware resource.

Similar concerns arise even in systems not subject to real-time demands or failures. For example, process execution speeds may vary considerably if processors are multiplexed among multiple processes. If a process executing in a critical region takes a page fault, exhausts its quantum, or is swapped out, then other runnable processes needing that resource will be unable to make progress.

An implementation of a concurrent object is wait-free if it guarantees that any process will complete any operation within a fixed number of steps, independent of the level of contention and the execution speeds of the other processes. To construct a wait-free implementation of the shared priority queue, we must break each enqueue or dequeue operation into a non-atomic sequence of atomic steps, where each atomic step is a primitive operation directly supported by the hardware, such as read, write, or fetch-and-add. To show that such an implementation is correct, it is necessary to show (1) that each operation's sequence of primitive steps has the desired effect (e.g., enqueuing or dequeuing an item) regardless of how it is interleaved with other concurrent operations, and (2) that each operation terminates within a fixed number of steps regardless of variations in speed (including arbitrary delay) of the other processes.

Support for wait-free synchronization requires genuinely new language constructs, not just variations on conventional approaches such as semaphores, monitors, tasks, or message passing. I don't know what these constructs look like, but in this position paper I would like to suggest some research directions that could lead, directly or indirectly, to progress in this area.
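To make the contrast concrete, here is a minimal sketch in C (using POSIX threads and C11 atomics, which postdate the paper and are not part of it; the names locked_queue, lq_enqueue, ticket_counter, and next_ticket are hypothetical). The first fragment protects a shared queue with a critical section, so a process that stalls inside lq_enqueue blocks every other caller; the second performs a single hardware fetch-and-add, so each call completes in a fixed number of steps regardless of how other processes are scheduled or delayed. The second fragment is only a shared counter, not a priority queue: assembling a wait-free priority queue out of such primitive steps is exactly the challenge posed above.

    #include <pthread.h>
    #include <stdatomic.h>

    /* Conventional approach: a critical section guards the whole operation. */
    typedef struct {
        pthread_mutex_t lock;
        int items[64];
        int count;
    } locked_queue;

    static locked_queue lq = { .lock = PTHREAD_MUTEX_INITIALIZER };

    void lq_enqueue(locked_queue *q, int item) {
        pthread_mutex_lock(&q->lock);   /* waits while any other process holds the lock */
        q->items[q->count++] = item;    /* critical section; capacity checks omitted */
        pthread_mutex_unlock(&q->lock);
    }

    /* Wait-free step: one primitive fetch-and-add, never blocks. */
    typedef struct { atomic_int tickets; } ticket_counter;

    static ticket_counter tc;           /* zero-initialized */

    int next_ticket(ticket_counter *c) {
        return atomic_fetch_add(&c->tickets, 1);  /* bounded steps under any interleaving */
    }

    int main(void) {
        lq_enqueue(&lq, 7);
        return next_ticket(&tc);        /* first caller receives ticket 0 */
    }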
We need to keep up with work in algorithms. To pick just one example, we now know that certain kinds of wait-free synchronization, e.g., implementing a FIFO queue from read/write registers, require randomized protocols in which processes flip coins to choose their next steps [3, 1]. The implications of such results for language design remain unclear, but they are suggestive. We also need to pay more attention to specification. Although transaction serializability has become widely accepted as the basic correctness condition for databases and certain distributed systems, identifying analogous properties for concurrent objects remains an active area of research [2].