Location-based memory fences

Authors:
Edya Ladan-Mozes;I-Ting Angelina Lee;Dmitry Vyukov
Affiliations:
MIT, Cambridge, MA, USA;MIT, Cambridge, MA, USA;OOO Google, Moscow, Russian Fed.
Venue:
Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures
Year:
2011

Abstract

Traditional memory fences are program-counter (PC) based. That is, a memory fence enforces a serialization point in the program instruction stream: it ensures that all memory references before the fence in program order have taken effect before execution continues on to the instructions after the fence. Such PC-based memory fences always cause the processor to stall, even when the synchronization is unnecessary during a particular execution. We propose the concept of location-based memory fences, which aim to reduce the cost of synchronization due to the latency of memory fence execution in parallel algorithms. Unlike a PC-based memory fence, a location-based memory fence serializes the instruction stream of the executing thread T1 only when a different thread T2 attempts to read the memory location that is guarded by the location-based memory fence. In this work, we describe a hardware mechanism for location-based memory fences, prove its correctness, and evaluate its potential performance benefit. Our experimental results are based on a software simulation of the proposed location-based memory fence and are thus expected to incur higher overhead than the proposed hardware mechanism would. Nevertheless, our software experiments show that applications can benefit from using location-based memory fences, although in some cases they do not scale as well because of the software overhead. These results suggest that hardware support for location-based memory fences is worth considering.
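
To make the contrast in the abstract concrete, the following is a minimal sketch of the conventional PC-based approach, not of the proposed hardware mechanism. It assumes C++11 atomics and a Dekker-style handshake in which each thread raises its own flag and then reads the other's; the function name count_both_zero and the variable names are hypothetical. Each thread executes a full fence between its store and its load on every run, whether or not the other thread is actually contending, which is the unconditional cost the abstract describes.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: repeats a Dekker-style handshake `rounds` times and
// returns how often both threads read the other's flag as still unset.
// With the sequentially consistent fences in place, that outcome is
// forbidden by the C++ memory model, so the count is expected to stay at 0.
int count_both_zero(int rounds) {
    int both_zero = 0;
    for (int i = 0; i < rounds; ++i) {
        std::atomic<int> flag1{0}, flag2{0};
        int r1 = -1, r2 = -1;

        std::thread t1([&] {
            flag1.store(1, std::memory_order_relaxed);
            // PC-based fence: executed on every pass through this point,
            // even if no other thread ever reads flag1.
            std::atomic_thread_fence(std::memory_order_seq_cst);
            r1 = flag2.load(std::memory_order_relaxed);
        });
        std::thread t2([&] {
            flag2.store(1, std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_seq_cst);
            r2 = flag1.load(std::memory_order_relaxed);
        });

        t1.join();
        t2.join();
        if (r1 == 0 && r2 == 0) {
            ++both_zero;  // would indicate the store/load pair was reordered
        }
    }
    return both_zero;
}
```

A location-based fence, as described in the abstract, would instead let T1 skip the serialization on the common path and pay for it only when T2 actually attempts to read the guarded location (flag1 in this sketch). Standard C++ has no such primitive, so the sketch shows only the baseline being improved upon.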