This paper presents lower bounds on the time and space complexity of implementations that use k-compare-and-swap (k-CAS) synchronization primitives. We prove that the use of k-CAS primitives improves neither the time nor the space complexity of implementations of widely used concurrent objects, such as counters, stacks, queues, and collect. Surprisingly, the use of k-CAS may even increase the space complexity required by such implementations.

We prove that the worst-case average number of steps performed by processes in any n-process implementation of a counter, stack, or queue object is Ω(log_{k+1} n), even if the implementation can use j-CAS for all j ≤ k. This bound holds even if a k-CAS operation is allowed to read the k values of the objects it accesses and return these values to the calling process, and the bound is tight.

We also consider the more realistic non-reading k-CAS primitives. An operation of a non-reading k-CAS primitive is only allowed to return a success/failure indication. For implementations of the collect object that use such primitives, we prove that the worst-case average number of steps performed by processes is Ω(log_2 n), regardless of the value of k. This implies an Ω(log_2 n) lower bound on the round complexity of such implementations. As there is a collect implementation with O(log_2 n) round complexity that uses only reads and writes, these results establish that, in terms of the round complexity of collect implementations, non-reading k-CAS is no stronger than read and write.

We also prove that k-CAS does not improve the space complexity of implementing many objects (including counter, stack, queue, and single-writer snapshot): an implementation must use at least n base objects even if k-CAS is allowed, and if all operations (other than read) swap exactly k base objects, then the space complexity must be at least k · n.
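To make the distinction between the two primitive variants concrete, here is a hedged sequential-specification sketch in Python. It models atomicity with a lock and is purely illustrative; the class and method names (`SharedMemory`, `reading_kcas`, `nonreading_kcas`) are ours, not from the paper. A reading k-CAS returns the k values it read even on failure, whereas a non-reading k-CAS reveals only success or failure:

```python
import threading

class SharedMemory:
    """Toy model of k cells of shared memory supporting k-CAS.

    Illustrative only: a lock stands in for hardware atomicity.
    """

    def __init__(self, size):
        self.cells = [0] * size
        self.lock = threading.Lock()  # models the atomicity of the primitive

    def reading_kcas(self, addrs, expected, new):
        """Reading k-CAS: atomically compare-and-swap k cells and return
        the k values read, whether or not the swap succeeds."""
        with self.lock:
            old = [self.cells[a] for a in addrs]
            if old == list(expected):
                for a, v in zip(addrs, new):
                    self.cells[a] = v
            return old

    def nonreading_kcas(self, addrs, expected, new):
        """Non-reading k-CAS: atomically compare-and-swap k cells,
        returning only a success/failure indication."""
        with self.lock:
            if all(self.cells[a] == e for a, e in zip(addrs, expected)):
                for a, v in zip(addrs, new):
                    self.cells[a] = v
                return True
            return False
```

Under this sketch, a failed `reading_kcas` still leaks the current contents of the k cells to the caller, which is exactly the extra power the abstract's first lower bound permits; `nonreading_kcas` denies the caller that information, which is what makes the Ω(log_2 n) collect bound possible.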