Understanding POWER multiprocessors

Authors:
Susmit Sarkar;Peter Sewell;Jade Alglave;Luc Maranget;Derek Williams
Affiliations:
University of Cambridge, Cambridge, United Kingdom;University of Cambridge, Cambridge, United Kingdom;Oxford University, Oxford, United Kingdom;INRIA, Rocquencourt, France;IBM Austin, Austin, TX, USA
Venue:
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Year:
2011

Abstract

Exploiting today's multiprocessors requires high-performance and correct concurrent systems code (optimising compilers, language runtimes, OS kernels, etc.), which in turn requires a good understanding of the observable processor behaviour that can be relied on. Unfortunately, this critical hardware/software interface is not at all clear for several current multiprocessors. In this paper we characterise the behaviour of IBM POWER multiprocessors, which have a subtle and highly relaxed memory model (ARM multiprocessors have a very similar architecture in this respect). We have conducted extensive experiments on several generations of processors: POWER G5, 5, 6, and 7. Based on these, on published details of the microarchitectures, and on discussions with IBM staff, we give an abstract-machine semantics that abstracts from most of the implementation detail but explains the behaviour of a range of subtle examples. Our semantics is explained in prose but defined in rigorous machine-processed mathematics; using an executable checker, we also confirm that it captures the observable processor behaviour, or the architectural intent, for our examples. While not officially sanctioned by the vendor, we believe that this model gives a reasonable basis for reasoning about current POWER multiprocessors. Our work should bring new clarity to concurrent systems programming for these architectures, and is a necessary precondition for any analysis or verification. It should also inform the design of languages such as C and C++, where the language memory model is constrained by what can be efficiently compiled to such multiprocessors.
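
As an illustration of the kind of example at stake, the following is a minimal C++11 sketch of the classic message-passing idiom. It is not taken from the paper: the function name `run_message_passing` and the variables `data` and `flag` are hypothetical, chosen only for this sketch. One thread writes a payload and then sets a flag; the other waits for the flag and then reads the payload. If both flag accesses used `memory_order_relaxed`, the language would permit the reader to see the flag set and still read the old payload, a relaxed outcome of the sort a highly relaxed architecture can exhibit. The release/acquire pair below rules that outcome out, and a compiler targeting POWER has to emit suitable barrier instructions (typically `lwsync` on the writer side) to enforce it.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper (not from the paper): runs one message-passing round
// and returns the payload value that the reader thread observed.
int run_message_passing() {
    std::atomic<int> data{0};       // payload
    std::atomic<bool> flag{false};  // "payload is ready" signal
    int observed = -1;              // written by the reader, read after join

    std::thread writer([&] {
        data.store(42, std::memory_order_relaxed);
        // Release: the payload write above is ordered before this store.
        flag.store(true, std::memory_order_release);
    });

    std::thread reader([&] {
        // Acquire: once the flag is seen as true, the payload write is visible.
        while (!flag.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        observed = data.load(std::memory_order_relaxed);
    });

    writer.join();
    reader.join();
    return observed;
}
```

Under the C++11 memory model, a call to `run_message_passing()` returns 42. Weakening both flag accesses to `memory_order_relaxed` removes that guarantee, although any particular run on any particular machine may still happen to return 42.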