Wait-free programming for general purpose computations on graphics processors
Proceedings of the twenty-seventh ACM symposium on Principles of distributed computing
This paper investigates the synchronization power of coalesced memory accesses, a family of memory access mechanisms introduced in recent large multicore architectures such as CUDA graphics processors. We first design three memory access models that capture the fundamental features of these new mechanisms. We then prove the exact synchronization power of the models in terms of their consensus numbers. These tight results show that coalesced memory accesses can provide strong synchronization between the threads of multicore processors using no synchronization primitives other than reads and writes. Moreover, based on the intrinsic features of recent GPU architectures, we construct strong synchronization objects, namely wait-free and t-resilient read-modify-write objects, for a general model of recent GPU architectures that lacks strong hardware synchronization primitives such as test-and-set and compare-and-swap. Accesses to the wait-free objects have time complexity O(N), where N is the number of processes. Our result demonstrates that wait-free synchronization mechanisms can be constructed for GPUs without strong synchronization primitives in hardware, and hence that wait-free programming is possible for GPUs.
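To illustrate why a write that updates several locations atomically can be stronger than ordinary reads and writes, the following is a minimal sketch of the classic textbook two-process consensus protocol built from an atomic two-register assignment. It is not one of the paper's three memory access models or its constructions. Treating a coalesced write as an atomic multi-register assignment is an assumption made here for illustration, and the names `MultiAssignRegisters`, `assign2`, and `TwoProcessConsensus` are hypothetical. The lock only simulates the atomicity that the hardware is assumed to provide.

```python
import threading

EMPTY = None  # marker for a register that nobody has written yet


class MultiAssignRegisters:
    """Three registers with an atomic two-register write (simulated, hypothetical)."""

    def __init__(self):
        self._regs = [EMPTY, EMPTY, EMPTY]
        # Assumption: the lock stands in for the atomicity of a coalesced write.
        self._lock = threading.Lock()

    def assign2(self, a, b, value):
        # Write the same value to registers a and b in one atomic step.
        with self._lock:
            self._regs[a] = value
            self._regs[b] = value

    def read(self, idx):
        with self._lock:
            return self._regs[idx]


class TwoProcessConsensus:
    """Consensus for process ids 0 and 1 using only reads and assign2."""

    def __init__(self):
        self._regs = MultiAssignRegisters()
        self._proposed = [None, None]

    def decide(self, pid, value):
        other = 1 - pid
        self._proposed[pid] = value  # publish the proposal before competing
        # Process 0 writes registers 0 and 1; process 1 writes registers 1 and 2.
        # Register 1 is shared and ends up holding the id of the later writer.
        self._regs.assign2(pid, pid + 1, pid)
        # The other process's private register is 2 for pid 0 and 0 for pid 1.
        seen = self._regs.read(2 * other)
        if seen is EMPTY or self._regs.read(1) == other:
            # The other process has not written yet, or it wrote after us:
            # our assignment came first, so our proposal is the decision.
            return self._proposed[pid]
        return self._proposed[other]


def run_once(value0, value1):
    """Run both processes concurrently and return their two decisions."""
    consensus = TwoProcessConsensus()
    results = [None, None]

    def worker(pid, value):
        results[pid] = consensus.decide(pid, value)

    threads = [
        threading.Thread(target=worker, args=(0, value0)),
        threading.Thread(target=worker, args=(1, value1)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Each call to `decide` performs one assignment and at most two reads, so in a setting where `assign2` is truly atomic in hardware no process waits on the other. Both processes return the proposal of whichever assignment took effect first.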