Impossibility and universality results for wait-free synchronization
PODC '88 Proceedings of the seventh annual ACM Symposium on Principles of distributed computing
A methodology for implementing highly concurrent data structures
PPOPP '90 Proceedings of the second ACM SIGPLAN symposium on Principles & practice of parallel programming
ACM Transactions on Programming Languages and Systems (TOPLAS)
Algorithms for scalable synchronization on shared-memory multiprocessors
ACM Transactions on Computer Systems (TOCS)
Scheduler activations: effective kernel support for the user-level management of parallelism
ACM Transactions on Computer Systems (TOCS)
Concurrent reading and writing
Communications of the ACM
The Performance of Spin Lock Alternatives for Shared-Memory Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
A methodology for implementing highly concurrent data objects
ACM Transactions on Programming Languages and Systems (TOPLAS)
Recent trends in experimental operating systems research
PODC '93 Proceedings of the twelfth annual ACM symposium on Principles of distributed computing
On the space complexity of randomized synchronization
PODC '93 Proceedings of the twelfth annual ACM symposium on Principles of distributed computing
Transactional memory: architectural support for lock-free data structures
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
A method for implementing lock-free shared-data structures
SPAA '93 Proceedings of the fifth annual ACM symposium on Parallel algorithms and architectures
A performance evaluation of lock-free synchronization protocols
PODC '94 Proceedings of the thirteenth annual ACM symposium on Principles of distributed computing
Disjoint-access-parallel implementations of strong shared memory primitives
PODC '94 Proceedings of the thirteenth annual ACM symposium on Principles of distributed computing
Proceedings of the fourteenth annual ACM symposium on Principles of distributed computing
Lock-free linked lists using compare-and-swap
Proceedings of the fourteenth annual ACM symposium on Principles of distributed computing
STOC '95 Proceedings of the twenty-seventh annual ACM symposium on Theory of computing
Universal operations: unary versus binary
PODC '96 Proceedings of the fifteenth annual ACM symposium on Principles of distributed computing
Real-time computing with lock-free shared objects
ACM Transactions on Computer Systems (TOCS)
On the space complexity of randomized synchronization
Journal of the ACM (JACM)
Improved implementations of binary universal operations
Journal of the ACM (JACM)
Speculative lock elision: enabling highly concurrent multithreaded execution
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Transactional lock-free execution of lock-based programs
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Characterizing the Performance of Algorithms for Lock-Free Objects
IEEE Transactions on Computers
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Improving Wait-Free Algorithms for Interprocess Communication in Embedded Real-Time Systems
ATEC '02 Proceedings of the General Track of the annual conference on USENIX Annual Technical Conference
Revocable locks for non-blocking programming
Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming
Wait-free queues with multiple enqueuers and dequeuers
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming
Lock-Free parallel algorithms: an experimental study
HiPC '04 Proceedings of the 11th international conference on High Performance Computing
Programming with relaxed synchronization
Proceedings of the 2012 ACM workshop on Relaxing synchronization for multicore and manycore scalability
This paper considers the implementation of non-blocking concurrent objects on shared-memory multiprocessors. Real multiprocessors have properties not present in theoretical models; these properties can be exploited to design non-blocking protocols that are more efficient in practice than those allowed by theoretical models. These new protocols rely on the operating system to take action when a thread of control is delayed during its non-blocking update. We illustrate the effectiveness of this approach by presenting two protocols that address factors hindering the performance of Herlihy's standard non-blocking protocol [Herlihy 90, Herlihy 91a]. These factors are the resources wasted by attempted non-blocking operations that fail and the cost of data copying. We demonstrate the importance of these factors experimentally, and show how they can be reduced using protocols that rely on operating system support. To reduce the overhead of failing non-blocking operations, our first protocol maintains information about the utilization of the shared object; experiments show that this protocol performs better than the known alternatives. To reduce the cost of data copying, we introduce a second, optimistic protocol that avoids copying except when a thread of control is delayed during its attempted update.
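The two costs named in the abstract can be seen in a minimal sketch of a copy-based non-blocking update. This is not the paper's code and not Herlihy's exact protocol: it is a simplified rendering of the general copy-then-swap idea, and every name in it (`AtomicRef`, `Counters`, `update`, `run`) is hypothetical. Python has no hardware compare-and-swap, so `AtomicRef` emulates one with an internal lock that stands in only for the atomicity of the instruction.

```python
import threading


class AtomicRef:
    """Emulates one word supporting compare-and-swap (assumption of this
    sketch: the lock only models the atomicity of the hardware instruction)."""

    def __init__(self, value):
        self._value = value
        self._guard = threading.Lock()

    def load(self):
        return self._value

    def compare_and_swap(self, expected, new):
        # Install `new` only if the root still points at `expected`.
        with self._guard:
            if self._value is expected:
                self._value = new
                return True
            return False


class Counters:
    """Hypothetical shared object; a published version is never mutated."""

    def __init__(self, values):
        self.values = values


def update(root, index):
    """Increment one counter without blocking; return the number of failed attempts."""
    failed = 0
    while True:
        old = root.load()
        new = Counters(list(old.values))  # cost 1: the whole object is copied
        new.values[index] += 1            # modify the private copy only
        if root.compare_and_swap(old, new):
            return failed
        failed += 1                       # cost 2: the copy and update were wasted


def worker(root, index, rounds, failures, slot):
    total = 0
    for _ in range(rounds):
        total += update(root, index)
    failures[slot] = total                # each thread writes its own slot


def run(threads=4, rounds=500):
    root = AtomicRef(Counters([0, 0]))
    failures = [0] * threads
    pool = [
        threading.Thread(target=worker, args=(root, i % 2, rounds, failures, i))
        for i in range(threads)
    ]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return root.load().values, sum(failures)


if __name__ == "__main__":
    print(run())
```

Calling `run()` returns the final counter values together with the total number of failed attempts. The failure count varies from run to run with thread scheduling, which is the wasted work the first protocol aims to reduce. The `list(old.values)` copy on every attempt is the copying cost the second protocol aims to avoid.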