Memory contention can be a serious performance bottleneck in concurrent programs on shared-memory multicore architectures. Having all threads write to a small set of shared locations, for example, can lead to orders of magnitude loss in performance relative to all threads writing to distinct locations, or even relative to a single thread doing all the writes. Shared write access, however, can be very useful in parallel algorithms, concurrent data structures, and protocols for communicating among threads.

We study the "priority update" operation as a useful primitive for limiting write contention in parallel and concurrent programs. A priority update takes as arguments a memory location, a new value, and a comparison function p that enforces a partial order over values. The operation atomically compares the new value with the current value in the memory location, and writes the new value only if it has higher priority according to p.

On the implementation side, we show that if implemented appropriately, priority updates greatly reduce memory contention over standard writes or other atomic operations when locations have a high degree of sharing. This is shown both experimentally and theoretically. On the application side, we describe several uses of priority updates for implementing parallel algorithms and concurrent data structures, often in a way that is deterministic, guarantees progress, and avoids serial bottlenecks. We present experiments showing that a variety of such algorithms and data structures perform well under high degrees of sharing.

Given the results, we believe that the priority update operation serves as a useful parallel primitive and good programming abstraction as (1) the user largely need not worry about the degree of sharing, (2) it can be used to avoid non-determinism since, in the common case when p is a total order, priority updates commute, and (3) it has many applications to programs using shared data.