An efficient unbounded lock-free queue for multi-core systems

Authors:
Marco Aldinucci; Marco Danelutto; Peter Kilpatrick; Massimiliano Meneghin; Massimo Torquati
Affiliations:
Computer Science Department, University of Torino, Italy; Computer Science Department, University of Pisa, Italy; Computer Science Department, Queen's University Belfast, UK; IBM Dublin Research Lab, Ireland; Computer Science Department, University of Pisa, Italy
Venue:
Euro-Par'12 Proceedings of the 18th international conference on Parallel Processing
Year:
2012

Abstract

Efficient synchronization mechanisms are crucial for implementing fine-grained parallel programs on modern shared-cache multi-core architectures. In this paper we study this problem by considering Single-Producer/Single-Consumer (SPSC) coordination using unbounded queues. We present a novel unbounded SPSC algorithm that reduces the raw synchronization latency and speeds up Producer-Consumer coordination. The algorithm has been extensively tested on a shared-cache multi-core platform, and a proof sketch of its correctness is presented. The proposed queues have been used as basic building blocks to implement the FastFlow parallel framework, which has been demonstrated to offer very good performance for fine-grained parallel applications.
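
To illustrate what lock-free SPSC coordination over an unbounded queue looks like in practice, here is a minimal C++ sketch. It is not the algorithm proposed in the paper, whose details the abstract does not give. It is a textbook linked-list queue with a dummy head node, and the names `SpscQueue`, `push`, and `pop` are hypothetical. It assumes exactly one producer thread and one consumer thread, and an element type that is default-constructible and movable.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

// Illustrative sketch only: an unbounded SPSC queue built as a singly linked
// list with a dummy head node. This is NOT the paper's algorithm.
// Assumption: exactly one thread calls push() and exactly one calls pop().
template <typename T>
class SpscQueue {
    struct Node {
        T value{};
        std::atomic<Node*> next{nullptr};
    };

    Node* head_;  // dummy node; accessed only by the consumer
    Node* tail_;  // last node; accessed only by the producer

public:
    SpscQueue() : head_(new Node), tail_(head_) {}

    // Must run only after both threads have stopped using the queue.
    ~SpscQueue() {
        while (head_ != nullptr) {
            Node* next = head_->next.load(std::memory_order_relaxed);
            delete head_;
            head_ = next;
        }
    }

    SpscQueue(const SpscQueue&) = delete;
    SpscQueue& operator=(const SpscQueue&) = delete;

    // Producer side: never blocks and never fails (memory permitting).
    void push(T v) {
        Node* n = new Node;
        n->value = std::move(v);
        // The release store publishes the fully initialized node.
        tail_->next.store(n, std::memory_order_release);
        tail_ = n;
    }

    // Consumer side: returns false if the queue is currently empty.
    bool pop(T& out) {
        // The acquire load pairs with the release store in push().
        Node* next = head_->next.load(std::memory_order_acquire);
        if (next == nullptr) return false;
        out = std::move(next->value);
        delete head_;   // the old dummy is no longer reachable by the producer
        head_ = next;   // the popped node becomes the new dummy
        return true;
    }
};
```

In use, the producer thread calls `push` and the consumer thread polls `pop` until it returns `true`. No locks or compare-and-swap operations are needed because each pointer has a single writer; the only synchronization is the release/acquire pair on `next`. The per-element heap allocation in this sketch is costly for fine-grained workloads, which is one reason practical designs amortize or avoid it.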