Everything you always wanted to know about synchronization but were afraid to ask

Authors:
Tudor David; Rachid Guerraoui; Vasileios Trigonakis
Affiliations:
EPFL, Switzerland; EPFL, Switzerland; EPFL, Switzerland
Venue:
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
Year:
2013

Abstract

This paper presents the most exhaustive study of synchronization to date. We span multiple layers, from hardware cache-coherence protocols up to high-level concurrent software. We do so on different types of architectures, from single-socket -- uniform and non-uniform -- to multi-socket -- directory and broadcast-based -- many-cores. We draw a set of observations that, roughly speaking, imply that scalability of synchronization is mainly a property of the hardware.