Pointer analysis for structured parallel programs
ACM Transactions on Programming Languages and Systems (TOPLAS)
Analysis of Multithreaded Programs
SAS '01 Proceedings of the 8th International Symposium on Static Analysis
A performance analysis of the Berkeley UPC compiler
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Automatic fence insertion for shared memory multiprocessing
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Compiler techniques for high performance sequentially consistent java programs
Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming
Making Sequential Consistency Practical in Titanium
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
A java compiler for many memory models - extended abstract
JVM'01 Proceedings of the 2001 Symposium on JavaTM Virtual Machine Research and Technology Symposium - Volume 1
Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization
Proceedings of the 37th annual international symposium on Computer architecture
The impact of memory models on software reliability in multiprocessors
Proceedings of the 30th annual ACM SIGACT-SIGOPS symposium on Principles of distributed computing
Automatic implementation of programming language consistency models
LCPC'02 Proceedings of the 15th international conference on Languages and Compilers for Parallel Computing
WeeFence: toward making fences free in TSO
Proceedings of the 40th Annual International Symposium on Computer Architecture
Hi-index | 0.00 |
We present a compiler technique, which is based on Shasha and Snir's delay set analysis, to hide the underlying relaxed memory consistency model for an optimizing compiler for explicitly parallel programs. The compiler presents programmers with a sequentially consistent view of the underlying machine irrespective of whether it follows a sequentially consistent model or a relaxed model. To hide the underlying relaxed memory consistency model and to guarantee sequential consistency, our algorithm inserts fence instructions by identifying memory-barrier nodes. We r educe the number of fence instructions by exploiting the ordering constraints of the underlying memory consistency model and the property of fence and synchronization operations. We introduce dominators with respect to a node in a control flow graph to identify memory-barrier nodes. We also show that minimizing the number of memory-barrier nodes by using dominators with respect to a node is NP-hard.