Performance issues in non-blocking synchronization on shared-memory multiprocessors

Authors:
Juan Alemany;Edward W. Felten
Affiliations:
Department of Computer Science and Engineering, University of Washington, Seattle, WA;Department of Computer Science and Engineering, University of Washington, Seattle, WA
Venue:
PODC '92 Proceedings of the eleventh annual ACM symposium on Principles of distributed computing
Year:
1992

Citing 7
Cited 23

Impossibility and universality results for wait-free synchronization
PODC '88 Proceedings of the seventh annual ACM Symposium on Principles of distributed computing

A methodology for implementing highly concurrent data structures
PPOPP '90 Proceedings of the second ACM SIGPLAN symposium on Principles & practice of parallel programming

Wait-free synchronization
ACM Transactions on Programming Languages and Systems (TOPLAS)

Algorithms for scalable synchronization on shared-memory multiprocessors
ACM Transactions on Computer Systems (TOCS)

Scheduler activations: effective kernel support for the user-level management of parallelism
ACM Transactions on Computer Systems (TOCS)

Concurrent reading and writing
Communications of the ACM

The Performance of Spin Lock Alternatives for Shared-Memory Multiprocessors
IEEE Transactions on Parallel and Distributed Systems

A methodology for implementing highly concurrent data objects
ACM Transactions on Programming Languages and Systems (TOPLAS)

Recent trends in experimental operating systems research
PODC '93 Proceedings of the twelfth annual ACM symposium on Principles of distributed computing

On the space complexity of randomized synchronization
PODC '93 Proceedings of the twelfth annual ACM symposium on Principles of distributed computing

Transactional memory: architectural support for lock-free data structures
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture

A method for implementing lock-free shared-data structures
SPAA '93 Proceedings of the fifth annual ACM symposium on Parallel algorithms and architectures

A performance evaluation of lock-free synchronization protocols
PODC '94 Proceedings of the thirteenth annual ACM symposium on Principles of distributed computing

Disjoint-access-parallel implementations of strong shared memory primitives
PODC '94 Proceedings of the thirteenth annual ACM symposium on Principles of distributed computing

Software transactional memory
Proceedings of the fourteenth annual ACM symposium on Principles of distributed computing

Lock-free linked lists using compare-and-swap
Proceedings of the fourteenth annual ACM symposium on Principles of distributed computing

Wait-free made fast
STOC '95 Proceedings of the twenty-seventh annual ACM symposium on Theory of computing

Universal operations: unary versus binary
PODC '96 Proceedings of the fifteenth annual ACM symposium on Principles of distributed computing

Real-time computing with lock-free shared objects
ACM Transactions on Computer Systems (TOCS)

On the space complexity of randomized synchronization
Journal of the ACM (JACM)

Improved implementations of binary universal operations
Journal of the ACM (JACM)

Speculative lock elision: enabling highly concurrent multithreaded execution
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture

Transactional lock-free execution of lock-based programs
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems

Characterizing the Performance of Algorithms for Lock-Free Objects
IEEE Transactions on Computers

Relative Performance of Preemption-Safe Locking and Non-Blocking Synchronization on Multiprogrammed Shared Memory Multiprocessors
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing

Improving Wait-Free Algorithms for Interprocess Communication in Embedded Real-Time Systems
ATEC '02 Proceedings of the General Track of the annual conference on USENIX Annual Technical Conference

Revocable locks for non-blocking programming
Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming

Wait-free queues with multiple enqueuers and dequeuers
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming

Lock-Free parallel algorithms: an experimental study
HiPC'04 Proceedings of the 11th international conference on High Performance Computing

Programming with relaxed synchronization
Proceedings of the 2012 ACM workshop on Relaxing synchronization for multicore and manycore scalability

Abstract

This paper considers the implementation of non-blocking concurrent objects on shared-memory multiprocessors. Real multiprocessors have properties not present in theoretical models; these properties can be exploited to design non-blocking protocols that are more efficient in practice than those allowed by theoretical models. These new protocols rely on the operating system to take action when a thread of control is delayed during its non-blocking update. We illustrate the effectiveness of this approach by presenting two protocols that address factors hindering the performance of Herlihy's standard non-blocking protocol [Herlihy 90, Herlihy 91a]. These factors are: resources wasted by attempted non-blocking operations that fail, and the cost of data copying. We demonstrate the importance of these factors experimentally, and show how they can be reduced using protocols that rely on operating system support. To reduce the overhead of failing non-blocking operations, our first protocol maintains information about the utilization of the shared object; experiments show that this protocol performs better than the known alternatives. To reduce the cost of data copying, we introduce a second, optimistic protocol that avoids copying, except in the case when a thread of control is delayed during its attempted update.
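
The two costs named in the abstract, wasted work from failed attempts and data copying, can be made concrete with a minimal sketch of a copy-then-swap update in the general style of the baseline protocol the abstract refers to. This is an illustration under stated assumptions, not the authors' code and not either of their two new protocols (which depend on operating system support that is not modeled here). All names (`CasCell`, `CopyUpdateObject`, `run_demo`) are hypothetical. Python has no user-level compare-and-swap instruction, so the sketch emulates one with a small private lock; on a real multiprocessor this step would be a single atomic instruction.

```python
import copy
import threading


class CasCell:
    """A single shared pointer with an emulated compare-and-swap.

    Assumption: the private lock stands in for an atomic hardware
    instruction; it is held only for the compare and the store.
    """

    def __init__(self, value):
        self._value = value
        self._guard = threading.Lock()

    def load(self):
        return self._value

    def compare_and_swap(self, expected, new):
        # Succeeds only if the cell still holds the exact object we read.
        with self._guard:
            if self._value is expected:
                self._value = new
                return True
            return False


class CopyUpdateObject:
    """Shared object updated by copying, modifying the copy, and swapping."""

    def __init__(self, state):
        self._root = CasCell(state)
        self._stats_lock = threading.Lock()  # protects the counters only
        self.copies = 0           # every attempt pays for one full copy
        self.failed_attempts = 0  # attempts whose copy was thrown away

    def read(self):
        return self._root.load()

    def update(self, mutate):
        while True:
            old = self._root.load()
            new = copy.deepcopy(old)   # cost 1: data copying
            mutate(new)                # work done on the private copy
            installed = self._root.compare_and_swap(old, new)
            with self._stats_lock:
                self.copies += 1
                if not installed:
                    self.failed_attempts += 1  # cost 2: wasted attempt
            if installed:
                return


def run_demo(num_threads, increments_per_thread):
    """Hypothetical driver: several threads increment one shared counter."""
    shared = CopyUpdateObject({"count": 0})

    def bump(state):
        state["count"] += 1

    def worker():
        for _ in range(increments_per_thread):
            shared.update(bump)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared


if __name__ == "__main__":
    obj = run_demo(4, 500)
    print(obj.read()["count"], obj.copies, obj.failed_attempts)
```

In this sketch every attempt, successful or not, pays for a full copy, so `copies` always equals the number of successful updates plus `failed_attempts`. The number of failures depends on thread scheduling and varies from run to run.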