Efficient synchronization primitives for large-scale cache-coherent multiprocessors
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Simple, fast, and practical non-blocking and blocking concurrent queue algorithms
PODC '96 Proceedings of the fifteenth annual ACM symposium on Principles of distributed computing
Using elimination to implement scalable and lock-free FIFO queues
Proceedings of the seventeenth annual ACM symposium on Parallelism in algorithms and architectures
SNZI: scalable NonZero indicators
Proceedings of the twenty-sixth annual ACM symposium on Principles of distributed computing
The Art of Multiprocessor Programming
The Art of Multiprocessor Programming
Scalable producer-consumer pools based on elimination-diffraction trees
Euro-Par'10 Proceedings of the 16th international Euro-Par conference on Parallel processing: Part II
SALSA: scalable and low synchronization NUMA-aware algorithm for producer-consumer pools
Proceedings of the twenty-fourth annual ACM symposium on Parallelism in algorithms and architectures
Hi-index | 0.00 |
Task pools have many important applications in distributed and parallel computing. Pools are typically implemented using concurrent queues, which limits their scalability. We introduce CAFÉ, Contention and Fairness Explorer, a scalable and wait-free task pool which allows users to control the trade-off between fairness and contention. The main idea behind CAFÉ is to maintain a list of TreeContainers, a novel tree-based data structure providing efficient task inserts and retrievals. TreeContainers don't guarantee FIFO ordering on task retrievals. But by varying the size of the trees, CAFÉ can provide any type of pool, from ones using large trees with low contention but less fairness, to ones using small trees with higher contention but also greater fairness. We demonstrate the scalability of TreeContainer by proving an O(log2 N) bound on the step complexity of insert operations when there are N inserts, as compared to an average of Ω(N) steps in a queue based implementation. We further prove that get operations are wait-free. Evaluations of CAFÉ show that it outperforms the Java SDK implementation of the Michael-Scott queue by a factor of 30, and is over three times faster than other state-of-the-art non-FIFO task pools.