Compilers: principles, techniques, and tools
Compilers: principles, techniques, and tools
Memory access buffering in multiprocessors
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Compiler algorithms for synchronization
IEEE Transactions on Computers
Efficient and correct execution of parallel programs that share memory
ACM Transactions on Programming Languages and Systems (TOPLAS)
Introduction to algorithms
Modern compiler implementation in Java
Modern compiler implementation in Java
JAVA '99 Proceedings of the ACM 1999 conference on Java Grande
Weak ordering—a new definition
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Memory consistency and event ordering in scalable shared-memory multiprocessors
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
A technique for reducing synchronization overhead in large scale multiprocessors
ISCA '85 Proceedings of the 12th annual international symposium on Computer architecture
Hiding Relaxed Memory Consistency with a Compiler
IEEE Transactions on Computers - Special issue on the parallel architecture and compilation techniques conference
The Java Language Specification
The Java Language Specification
Hiding Relaxed Memory Consistency with Compilers
PACT '00 Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques
Compilation Techniques for Explicitly Parallel Programs
Compilation Techniques for Explicitly Parallel Programs
Automatic implementation of programming language consistency models
LCPC'02 Proceedings of the 15th international conference on Languages and Compilers for Parallel Computing
Compiler techniques for high performance sequentially consistent java programs
Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming
A two-phase escape analysis for parallel java programs
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Lightweight lock-free synchronization methods for multithreading
Proceedings of the 20th annual international conference on Supercomputing
CheckFence: checking consistency of concurrent data types on relaxed memory models
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
Practical escape analyses: how good are they?
Proceedings of the 3rd international conference on Virtual execution environments
Effective Program Verification for Relaxed Memory Models
CAV '08 Proceedings of the 20th international conference on Computer Aided Verification
Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Incorporation of OpenMP memory consistency into conventional dataflow analysis
IWOMP'08 Proceedings of the 4th international conference on OpenMP in a new era of parallelism
Efficient sequential consistency using conditional fences
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
On the effectiveness of speculative and selective memory fences
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Partial-coherence abstractions for relaxed memory models
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Automatic inference of memory fences
Proceedings of the 2010 Conference on Formal Methods in Computer-Aided Design
Stability in weak memory models
CAV'11 Proceedings of the 23rd international conference on Computer aided verification
A verification-based approach to memory fence insertion in relaxed memory systems
Proceedings of the 18th international SPIN conference on Model checking software
Evaluating the impact of thread escape analysis on a memory consistency model-aware compiler
LCPC'05 Proceedings of the 18th international conference on Languages and Compilers for Parallel Computing
Verification of STM on relaxed memory models
Formal Methods in System Design
Clarifying and compiling C/C++ concurrency: from C++11 to POWER
POPL '12 Proceedings of the 39th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Bounded model checking of concurrent data types on relaxed memory models: a case study
CAV'06 Proceedings of the 18th international conference on Computer Aided Verification
Efficient computation of communicator variables for programs with unstructured parallelism
LCPC'04 Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing
Efficient sequential consistency via conflict ordering
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Dynamic synthesis for relaxed memory models
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
Counter-Example guided fence insertion under TSO
TACAS'12 Proceedings of the 18th international conference on Tools and Algorithms for the Construction and Analysis of Systems
Automatic inference of memory fences
ACM SIGACT News
Volition: scalable and precise sequential consistency violation detection
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
Vulcan: Hardware Support for Detecting Sequential Consistency Violations Dynamically
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Proceedings of the 27th international ACM conference on International conference on supercomputing
Hi-index | 0.00 |
In general, the hardware memory consistency model in a multiprocessor system is not identical to the memory model at the programming language level. Consequently, the programming language memory model must be mapped onto the hardware memory model. Memory fence instructions can be inserted by the compiler where needed to accomplish this mapping. We have developed and implemented several fence insertion and optimization algorithms in our Pensieve compiler project. We present the different fence insertion optimization techniques that were used in this system to guarantee sequential consistency at the language level, and compare them using performance data. Our techniques target two hardware relaxed memory consistency models provided by SMPs based on IBM Power 3 and Intel Pentium 4. Our fence insertion optimization shows up to 17.2% and 32.7% performance improvement on average, with the IBM PowerPC and Intel Pentium 4 (Xeon) multiprocessors respectively.