Memory access buffering in multiprocessors
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Performance evaluation of memory consistency models for shared-memory multiprocessors
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
How to Make a Correct Multiprocess Program Execute Correctly on a Multiprocessor
IEEE Transactions on Computers
Data speculation support for a chip multiprocessor
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Proceedings of the 14th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Removing unnecessary synchronization in Java
Proceedings of the 14th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Compositional pointer and escape analysis for Java programs
Proceedings of the 14th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Java Language Specification, Second Edition: The Java Series
Java Language Specification, Second Edition: The Java Series
Speculative lock elision: enabling highly concurrent multithreaded execution
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
ICC++-AC++ Dialect for High Performance Parallel Computing
ISOTAS '96 Proceedings of the Second JSSST International Symposium on Object Technologies for Advanced Software
Static Analyses for Eliminating Unnecessary Synchronization from Java Programs
SAS '99 Proceedings of the 6th International Symposium on Static Analysis
Automatic fence insertion for shared memory multiprocessing
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Design Issues and Tradeoffs for Write Buffers
HPCA '97 Proceedings of the 3rd IEEE Symposium on High-Performance Computer Architecture
Proceedings of the 32nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages
CheckFence: checking consistency of concurrent data types on relaxed memory models
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
InvisiFence: performance-transparent memory ordering in conventional multiprocessors
Proceedings of the 36th annual international symposium on Computer architecture
Hi-index | 0.00 |
Memory fences inhibit the reordering of memory accesses in modern microprocessors; fences are useful to implement synchronization and strong shared memory semantics in multi-threaded programs. A naive implementation of memory fences can result in a significant performance penalty for processors with deep pipelines supporting multiple concurrent memory accesses. The paper compares three techniques to reduce the impact of memory fences: (1) Read-speculation allows reads that follow a fence to be issued while the fence is being processed; (2) Write-ahead additionally allows writes following a fence to proceed early; (3) Selective fences distinguish between memory accesses to thread-local and shared memory and enforce ordering only among accesses to shared memory. We evaluate and compare the effectiveness of these techniques with a simulator derived from the Pentium 4 architecture. We report data for a storage model that uses memory fences to enforce the memory semantics at monitor boundaries.