On the effectiveness of speculative and selective memory fences

Authors:
Oliver Trachsel; Christoph von Praun; Thomas R. Gross
Affiliations:
Department of Computer Science, ETH Zurich, Zurich, Switzerland (all authors)
Venue:
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Year:
2006

Abstract

Memory fences inhibit the reordering of memory accesses in modern microprocessors; they are used to implement synchronization and strong shared-memory semantics in multi-threaded programs. A naive implementation of memory fences can incur a significant performance penalty on processors with deep pipelines that support multiple concurrent memory accesses. The paper compares three techniques to reduce the impact of memory fences: (1) read-speculation allows reads that follow a fence to be issued while the fence is being processed; (2) write-ahead additionally allows writes that follow a fence to proceed early; (3) selective fences distinguish between accesses to thread-local and shared memory and enforce ordering only among accesses to shared memory. We evaluate and compare the effectiveness of these techniques with a simulator derived from the Pentium 4 architecture. We report data for a storage model that uses memory fences to enforce the memory semantics at monitor boundaries.
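To illustrate the difference between a conventional fence and a selective fence, the following is a minimal toy timing model, not the simulator used in the paper. It assumes in-order issue of one access per cycle, overlapping accesses, and made-up latencies; the names `Op` and `run` are hypothetical. Read-speculation and write-ahead are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    kind: str             # "access" or "fence"
    shared: bool = False  # True if the access targets shared memory
    latency: int = 1      # cycles until the access completes (assumed value)


def run(ops, selective):
    """Return the cycle at which the last access completes.

    selective=False: a fence stalls every later access until all earlier
                     accesses have completed.
    selective=True:  a fence stalls only later shared accesses, and only
                     until earlier shared accesses have completed.
    """
    clock = 0           # next free issue slot (one access issued per cycle)
    done_shared = 0     # latest completion time of any shared access so far
    done_local = 0      # latest completion time of any thread-local access
    barrier_shared = 0  # earliest cycle at which a shared access may issue
    barrier_local = 0   # earliest cycle at which a local access may issue

    for op in ops:
        if op.kind == "fence":
            if selective:
                barrier_shared = max(barrier_shared, done_shared)
            else:
                wait = max(done_shared, done_local)
                barrier_shared = max(barrier_shared, wait)
                barrier_local = max(barrier_local, wait)
            continue

        barrier = barrier_shared if op.shared else barrier_local
        issue = max(clock, barrier)
        finish = issue + op.latency
        if op.shared:
            done_shared = max(done_shared, finish)
        else:
            done_local = max(done_local, finish)
        clock = issue + 1

    return max(done_shared, done_local)


# A slow thread-local access (e.g. a cache miss) followed by a fence.
trace = [
    Op("access", shared=False, latency=100),
    Op("fence"),
    Op("access", shared=True, latency=2),
    Op("access", shared=False, latency=2),
]

full_cycles = run(trace, selective=False)      # 103 in this toy model
selective_cycles = run(trace, selective=True)  # 100 in this toy model
```

In this model the selective fence helps only when the outstanding access before the fence is thread-local. If the slow access is to shared memory, both fence variants stall for the same number of cycles.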