Understanding POWER multiprocessors
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Exploiting today's multiprocessors requires high-performance and correct concurrent systems code (optimising compilers, language runtimes, OS kernels, etc.), which in turn requires a good understanding of the observable processor behaviour that can be relied on. Unfortunately this critical hardware/software interface is not at all clear for several current multiprocessors. In this paper we characterise the behaviour of IBM POWER multiprocessors, which have a subtle and highly relaxed memory model (ARM multiprocessors have a very similar architecture in this respect). We have conducted extensive experiments on several generations of processors: POWER G5, 5, 6, and 7. Based on these, on published details of the microarchitectures, and on discussions with IBM staff, we give an abstract-machine semantics that abstracts from most of the implementation detail but explains the behaviour of a range of subtle examples. Our semantics is explained in prose but defined in rigorous machine-processed mathematics; we also confirm that it captures the observable processor behaviour, or the architectural intent, for our examples with an executable checker. While not officially sanctioned by the vendor, we believe that this model gives a reasonable basis for reasoning about current POWER multiprocessors. Our work should bring new clarity to concurrent systems programming for these architectures, and is a necessary precondition for any analysis or verification. It should also inform the design of languages such as C and C++, where the language memory model is constrained by what can be efficiently compiled to such multiprocessors.
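As an illustration of the kind of subtle example the abstract refers to, the following is a minimal sketch of the classic message-passing (MP) shape, written in C++11 atomics rather than POWER assembly. It is not taken from the paper: the function name `run_mp_once` and its parameters are hypothetical, chosen only for this sketch. With relaxed orderings the language permits the reader to see the flag set and still read the old value of `data`, and a relaxed-memory machine such as POWER may exhibit that outcome. With a release store paired with an acquire load, the C++ memory model guarantees that the reader sees the new value.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper for illustration: runs one message-passing round.
// The writer publishes `data` and then sets `flag`; the reader waits for
// `flag` and then reads `data`. Returns the value of `data` the reader saw.
//
// store_order: ordering used for the flag store (relaxed or release).
// load_order:  ordering used for the flag load (relaxed or acquire).
int run_mp_once(std::memory_order store_order, std::memory_order load_order) {
    std::atomic<int> data{0};
    std::atomic<int> flag{0};
    int observed = -1;

    std::thread writer([&] {
        data.store(1, std::memory_order_relaxed);  // payload write
        flag.store(1, store_order);                // publish
    });

    std::thread reader([&] {
        while (flag.load(load_order) == 0) {
            // spin until the flag write becomes visible
        }
        // With relaxed orderings on the flag, this read is allowed to
        // return 0. With release/acquire on the flag, it must return 1.
        observed = data.load(std::memory_order_relaxed);
    });

    writer.join();
    reader.join();
    return observed;
}
```

Calling `run_mp_once(std::memory_order_release, std::memory_order_acquire)` always returns 1. Calling it with `std::memory_order_relaxed` for both arguments may return 0 on a sufficiently relaxed machine, although a single run is unlikely to show it; observing such outcomes in practice generally requires many iterations of a dedicated test harness.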