Compilers: principles, techniques, and tools
Compilers: principles, techniques, and tools
Advanced compiler optimizations for supercomputers
Communications of the ACM - Special issue on parallelism
Concurrency control in a system for distributed databases (SDD-1)
ACM Transactions on Database Systems (TODS)
Issues related to MIMD shared-memory computers: the NYU ultracomputer approach
ISCA '85 Proceedings of the 12th annual international symposium on Computer architecture
Concurrency Control in Distributed Database Systems
ACM Computing Surveys (CSUR)
Time, clocks, and the ordering of events in a distributed system
Communications of the ACM
An optimality theory of concurrency control for databases
SIGMOD '79 Proceedings of the 1979 ACM SIGMOD international conference on Management of data
Efficient synchronization of multiprocessors with shared memory
ACM Transactions on Programming Languages and Systems (TOPLAS)
A model for concurrency in nested transactions systems
Journal of the ACM (JACM)
The APRAM: incorporating asynchrony into the PRAM model
SPAA '89 Proceedings of the first annual ACM symposium on Parallel algorithms and architectures
Proving sequential consistency of high-performance shared memories (extended abstract)
SPAA '91 Proceedings of the third annual ACM symposium on Parallel algorithms and architectures
A correctness condition for high-performance multiprocessors (extended abstract)
STOC '92 Proceedings of the twenty-fourth annual ACM symposium on Theory of computing
Simple rational guidance for chopping up transactions
SIGMOD '92 Proceedings of the 1992 ACM SIGMOD international conference on Management of data
Compile-time analysis of parallel programs that share memory
POPL '92 Proceedings of the 19th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
ACM Transactions on Programming Languages and Systems (TOPLAS)
Programming models for irregular applications
ACM SIGPLAN Notices - Workshop on languages, compilers and run-time environments for distributed memory multiprocessors
SPAA '93 Proceedings of the fifth annual ACM symposium on Parallel algorithms and architectures
Data flow analysis for parallel programs
CSC '93 Proceedings of the 1993 ACM conference on Computer science
On testing cache-coherent shared memories
SPAA '94 Proceedings of the sixth annual ACM symposium on Parallel algorithms and architectures
Optimizing parallel programs with explicit synchronization
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Transaction chopping: algorithms and performance studies
ACM Transactions on Database Systems (TODS)
Verification techniques for cache coherence protocols
ACM Computing Surveys (CSUR)
Lamport clocks: verifying a directory cache-coherence protocol
Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
Retrospective: weak ordering—a new definition
25 years of the international symposia on Computer architecture (selected papers)
Weak ordering—a new definition
25 years of the international symposia on Computer architecture (selected papers)
Memory consistency and event ordering in scalable shared-memory multiprocessors
25 years of the international symposia on Computer architecture (selected papers)
Basic compiler algorithms for parallel programs
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Code motion for explicitly parallel programs
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Compile-time detection of race conditions in a parallel program
ICS '89 Proceedings of the 3rd international conference on Supercomputing
Weak ordering—a new definition
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Memory consistency and event ordering in scalable shared-memory multiprocessors
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Hiding Relaxed Memory Consistency with a Compiler
IEEE Transactions on Computers - Special issue on the parallel architecture and compilation techniques conference
Pointer analysis for structured parallel programs
ACM Transactions on Programming Languages and Systems (TOPLAS)
IEEE Micro
A Unified Formalization of Four Shared-Memory Models
IEEE Transactions on Parallel and Distributed Systems
Access Graphs: A Model for Investigating Memory Consistency
IEEE Transactions on Parallel and Distributed Systems
A Technique for the Distributed Simulation of Parallel Computers
MASCOTS '95 Proceedings of the 3rd International Workshop on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems
Analysis of Multithreaded Programs
SAS '01 Proceedings of the 8th International Symposium on Static Analysis
Static conflict analysis for multi-threaded object-oriented programs
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Automatic fence insertion for shared memory multiprocessing
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Hardware Controlled Prefeching in Directory-Based Cache Coherent Systems
FRONTIERS '96 Proceedings of the 6th Symposium on the Frontiers of Massively Parallel Computation
ReEnact: using thread-level speculation mechanisms to debug data races in multithreaded codes
Proceedings of the 30th annual international symposium on Computer architecture
A "flight data recorder" for enabling full-system multiprocessor deterministic replay
Proceedings of the 30th annual international symposium on Computer architecture
Consistency and event ordering in the shared regions model
CASCON '93 Proceedings of the 1993 conference of the Centre for Advanced Studies on Collaborative research: distributed computing - Volume 2
Proceedings of the 32nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Proving refinement using transduction
Distributed Computing - Special issue: Verification of lazy caching
Compiler techniques for high performance sequentially consistent java programs
Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming
Design Space Exploration of a Software Speculative Parallelization Scheme
IEEE Transactions on Parallel and Distributed Systems
Communication Optimizations for Fine-Grained UPC Applications
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
Making Sequential Consistency Practical in Titanium
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
International Journal of Parallel Programming
Memory Model = Instruction Reordering + Store Atomicity
Proceedings of the 33rd annual international symposium on Computer Architecture
Proceedings of the 33rd annual international symposium on Computer Architecture
A two-phase escape analysis for parallel java programs
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Lightweight lock-free synchronization methods for multithreading
Proceedings of the 20th annual international conference on Supercomputing
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Reordering constraints for pthread-style locks
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
CheckFence: checking consistency of concurrent data types on relaxed memory models
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
Practical escape analyses: how good are they?
Proceedings of the 3rd international conference on Virtual execution environments
A java compiler for many memory models - extended abstract
JVM'01 Proceedings of the 2001 Symposium on JavaTM Virtual Machine Research and Technology Symposium - Volume 1
WODA '07 Proceedings of the 5th International Workshop on Dynamic Analysis
Problems with using MPI 1.1 and 2.0 as compilation targets for parallel language implementations
International Journal of High Performance Computing and Networking
Foundations of the C++ concurrency memory model
Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation
Effective Program Verification for Relaxed Memory Models
CAV '08 Proceedings of the 20th international conference on Computer Aided Verification
A Framework for Proving Correctness of Adjoint Message-Passing Programs
Proceedings of the 15th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Memory models: a case for rethinking parallel languages and hardware
Communications of the ACM
Incorporation of OpenMP memory consistency into conventional dataflow analysis
IWOMP'08 Proceedings of the 4th international conference on OpenMP in a new era of parallelism
DRFX: a simple and efficient memory model for concurrent programming languages
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
Proceedings of the 37th annual international symposium on Computer architecture
Efficient sequential consistency using conditional fences
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Automatic atomic region identification in shared memory SPMD programs
Proceedings of the ACM international conference on Object oriented programming systems languages and applications
A formal approach to replica consistency in directory service
PPAM'09 Proceedings of the 8th international conference on Parallel processing and applied mathematics: Part I
Reasoning about the implementation of concurrency abstractions on x86-TSO
ECOOP'10 Proceedings of the 24th European conference on Object-oriented programming
Proceedings of the 38th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Efficient processor support for DRFx, a memory model with exceptions
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
Sound and complete monitoring of sequential consistency for relaxed memory models
TACAS'11/ETAPS'11 Proceedings of the 17th international conference on Tools and algorithms for the construction and analysis of systems: part of the joint European conferences on theory and practice of software
Partial-coherence abstractions for relaxed memory models
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
A case for an SC-preserving compiler
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Safe optimisations for shared-memory concurrent programs
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Automatic inference of memory fences
Proceedings of the 2010 Conference on Formal Methods in Computer-Aided Design
Deciding robustness against total store ordering
ICALP'11 Proceedings of the 38th international conference on Automata, languages and programming - Volume Part II
Stability in weak memory models
CAV'11 Proceedings of the 23rd international conference on Computer aided verification
Verifying fence elimination optimisations
SAS'11 Proceedings of the 18th international conference on Static analysis
Efficient computation of may-happen-in-parallel information for concurrent java programs
LCPC'05 Proceedings of the 18th international conference on Languages and Compilers for Parallel Computing
Evaluating the impact of thread escape analysis on a memory consistency model-aware compiler
LCPC'05 Proceedings of the 18th international conference on Languages and Compilers for Parallel Computing
Clarifying and compiling C/C++ concurrency: from C++11 to POWER
POPL '12 Proceedings of the 39th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Bounded model checking of concurrent data types on relaxed memory models: a case study
CAV'06 Proceedings of the 18th international conference on Computer Aided Verification
Efficient computation of communicator variables for programs with unstructured parallelism
LCPC'04 Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing
CAV'10 Proceedings of the 22nd international conference on Computer Aided Verification
Automatic implementation of programming language consistency models
LCPC'02 Proceedings of the 15th international conference on Languages and Compilers for Parallel Computing
GKLEE: concolic verification and test generation for GPUs
Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming
Efficient sequential consistency via conflict ordering
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Fences in weak memory models (extended version)
Formal Methods in System Design
On a Technique for Transparently Empowering Classical Compiler Optimizations on Multithreaded Code
ACM Transactions on Programming Languages and Systems (TOPLAS)
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
Dynamic synthesis for relaxed memory models
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
Automatic inference of memory fences
ACM SIGACT News
End-to-end sequential consistency
Proceedings of the 39th Annual International Symposium on Computer Architecture
Checking and enforcing robustness against TSO
ESOP'13 Proceedings of the 22nd European conference on Programming Languages and Systems
MEMORAX, a precise and sound tool for automatic fence insertion under TSO
TACAS'13 Proceedings of the 19th international conference on Tools and Algorithms for the Construction and Analysis of Systems
Vulcan: Hardware Support for Detecting Sequential Consistency Violations Dynamically
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Proceedings of the 27th international ACM conference on International conference on supercomputing
WeeFence: toward making fences free in TSO
Proceedings of the 40th Annual International Symposium on Computer Architecture
CompCertTSO: A Verified Compiler for Relaxed-Memory Concurrency
Journal of the ACM (JACM)
Compiler testing via a theory of sound optimisations in the C11/C++11 memory model
Proceedings of the 34th ACM SIGPLAN conference on Programming language design and implementation
Hi-index | 0.02 |
In this paper we consider an optimization problem that arises in the execution of parallel programs on shared-memory multiple-instruction-stream, multiple-data-stream (MIMD) computers. A program on such machines consists of many sequential program segments, each executed by a single processor. These segments interact as they access shared variables. Access to memory is asynchronous, and memory accesses are not necessarily executed in the order they were issued. An execution is correct if it is sequentially consistent: It should seem as if all the instructions were executed sequentially, in an order obtained by interleaving the instruction streams of the processors. Sequential consistency can be enforced by delaying each access to shared memory until the previous access of the same processor has terminated. For performance reasons, however, we want to allow several accesses by the same processor to proceed concurrently. Our analysis finds a minimal set of delays that enforces sequential consistency. The analysis extends to interprocessor synchronization constraints and to code where blocks of operations have to execute atomically. We use a conflict graph similar to that used to schedule transactions in distributed databases. Our graph incorporates the order on operations given by the program text, enabling us to do without locks even when database conflict graphs would suggest that locks are necessary. Our work has implications for the design of multiprocessors; it offers new compiler optimization techniques for parallel languages that support shared variables.