An adaptive technique for constructing robust and high-throughput shared objects

Authors:
Danny Hendler;Shay Kutten;Erez Michalak
Affiliations:
Department of Computer-Science, Ben-Gurion University;Department of Industrial Engineering and Management, Technion;Department of Industrial Engineering and Management, Technion
Venue:
OPODIS'10 Proceedings of the 14th international conference on Principles of distributed systems
Year:
2010

Abstract

Shared counters are key to solving a variety of coordination problems on multiprocessor machines, such as barrier synchronization and index distribution. Like shared objects in general, they should be robust, linearizable, and scalable. We present the first linearizable and wait-free shared counter algorithm that achieves high throughput without a priori knowledge of the system's level of asynchrony. Our algorithm can easily be adapted to other combinable objects, such as stacks and queues. In particular, in an N-process execution E, our algorithm achieves throughput of $\Omega(N / (\varphi_E^2 \log^2 \varphi_E \log N))$, where $\varphi_E$ is E's level of asynchrony. Moreover, our algorithm tolerates any constant number of faults: if E contains a constant number of faults, the algorithm still achieves throughput of $\Omega(N / ({\varphi'_E}^2 \log^2 \varphi'_E \log N))$, where $\varphi'_E$ bounds the relative speeds of any two processes at any time when both participate in E and neither has failed. Our algorithm can be viewed as an adaptive version of the earlier Bounded-Wait-Combining (BWC) algorithm. BWC receives as input an argument $\varphi$, a (supposed) upper bound on $\varphi_E$, and achieves optimal throughput if $\varphi = \varphi_E$. However, if the given $\varphi$ is lower than the actual $\varphi_E$, or much greater than it, the throughput of BWC degrades significantly. Moreover, whereas BWC is only lock-free, our algorithm is more robust, since it is wait-free. To achieve high throughput and wait-freedom, we present a method that guarantees, for a common kind of procedure, the procedure's successful termination in bounded time, regardless of shared-memory contention. This method may prove useful in its own right for other problems.
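
To make the index-distribution use case concrete, here is a minimal Python sketch of a shared fetch-and-increment counter handing out unique indices to several threads. It is not the algorithm described in the abstract: it uses a plain lock, so it is blocking rather than wait-free, and it does no combining. The names `SharedCounter`, `fetch_and_increment`, and `distribute_indices` are hypothetical and chosen only for illustration.

```python
import threading


class SharedCounter:
    """Lock-based fetch-and-increment counter.

    Assumption: a mutex stands in for the synchronization mechanism.
    This is blocking (NOT wait-free or lock-free) and does no combining.
    """

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def fetch_and_increment(self):
        # The critical section makes each operation appear atomic,
        # so every caller observes a distinct value.
        with self._lock:
            current = self._value
            self._value += 1
            return current


def distribute_indices(num_threads, per_thread):
    """Each thread draws `per_thread` indices from one shared counter."""
    counter = SharedCounter()
    results = [[] for _ in range(num_threads)]

    def worker(tid):
        for _ in range(per_thread):
            results[tid].append(counter.fetch_and_increment())

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    per_thread_indices = distribute_indices(num_threads=4, per_thread=5)
    all_indices = sorted(i for chunk in per_thread_indices for i in chunk)
    # Every index in 0..19 is handed out exactly once.
    print(all_indices == list(range(20)))
```

Because every operation serializes on one lock, a single slow or stalled thread holding the lock delays all others. That is the kind of sensitivity to asynchrony and faults that combining-based, wait-free constructions such as the one summarized above aim to avoid.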