Distributing Hot-Spot Addressing in Large-Scale Multiprocessors
IEEE Transactions on Computers
Linearizability: a correctness condition for concurrent objects
ACM Transactions on Programming Languages and Systems (TOPLAS)
ACM Transactions on Programming Languages and Systems (TOPLAS)
An efficient implementation scheme of concurrent object-oriented languages on stock multicomputers
PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
Nonblocking algorithms and preemption-safe locking on multiprogrammed shared memory multiprocessors
Journal of Parallel and Distributed Computing
A scalable lock-free stack algorithm
Journal of Parallel and Distributed Computing
Flat combining and the synchronization-parallelism tradeoff
Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
Scalable producer-consumer pools based on elimination-diffraction trees
Euro-Par'10 Proceedings of the 16th international Euro-Par conference on Parallel processing: Part II
An adaptive technique for constructing robust and high-throughput shared objects
OPODIS'10 Proceedings of the 14th international conference on Principles of distributed systems
Two key synchronization paradigms for the construction of scalable concurrent data structures are software combining and elimination. Elimination-based concurrent data structures allow operations with reverse semantics (such as push and pop stack operations) to "collide" and exchange values without having to access a central location. Software combining, on the other hand, is effective when colliding operations have identical semantics: when a pair of threads performing operations with identical semantics collide, the task of performing the combined set of operations is delegated to one of the threads, and the other thread waits for its operation(s) to be performed. Applying this mechanism iteratively can reduce memory contention and increase throughput. The most highly scalable prior concurrent stack algorithm is the elimination-backoff stack [5]. The elimination-backoff stack provides high parallelism for symmetric workloads, in which the numbers of push and pop operations are roughly equal, but its performance deteriorates when workloads are asymmetric. We present DECS, a novel Dynamic Elimination-Combining Stack algorithm that scales well for all workload types. While maintaining the simplicity and low overhead of the elimination-backoff stack, DECS benefits from collisions of both identical- and reverse-semantics operations. Our empirical evaluation shows that DECS scales significantly better than the best prior blocking and non-blocking stack algorithms.
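The two collision rules described above can be illustrated with a small single-threaded sketch. It is not the DECS algorithm itself: it omits all concurrency control (collision arrays, compare-and-swap, waiting) and only shows what a collision decides. The names `Op`, `collide`, and `apply_batch` are hypothetical, and representing a group of combined operations as a plain list is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Any, List

PUSH, POP = "push", "pop"


@dataclass
class Op:
    """One pending stack operation (hypothetical helper type)."""
    kind: str           # PUSH or POP
    value: Any = None   # value to push, or the result of a pop
    done: bool = False  # set once the operation has taken effect


def collide(a: List[Op], b: List[Op]) -> List[Op]:
    """Resolve a collision between two batches of same-kind operations.

    Returns the operations that still have to reach the central stack.
    """
    if not a or not b:
        return a + b
    if a[0].kind == b[0].kind:
        # Identical semantics: combining. The owner of `a` becomes the
        # delegate and will perform b's operations as well.
        return a + b
    # Reverse semantics: elimination. Pair pushes with pops and hand the
    # values over directly, without touching the central stack.
    pushes, pops = (a, b) if a[0].kind == PUSH else (b, a)
    n = min(len(pushes), len(pops))
    for push, pop in zip(pushes, pops):
        pop.value = push.value
        push.done = pop.done = True
    # At most one of these two slices is non-empty.
    return pushes[n:] + pops[n:]


def apply_batch(stack: List[Any], batch: List[Op]) -> None:
    """The delegate applies every remaining operation to the central stack."""
    for op in batch:
        if op.kind == PUSH:
            stack.append(op.value)
        else:
            op.value = stack.pop() if stack else None  # None models "empty"
        op.done = True
```

In this sketch, two colliding pushes merge into one batch owned by a single delegate. If that batch then meets a pop, one push/pop pair is eliminated and only the leftover push is applied to the central stack by `apply_batch`.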