Multiprocessors Should Support Simple Memory-Consistency Models

Authors:
Mark D. Hill
Affiliations:
-
Venue:
Computer
Year:
1998

Abstract

In the future, many computers will contain multiple processors, in part because the marginal cost of adding a few processors is so low that only a minimal performance gain is needed to make them cost-effective. Intel, for example, now makes cards containing four Pentium Pro processors that can easily be incorporated into a system. Multiple-processor cards like Intel's will help multiprocessing spread from servers to the desktop. But how will these multiprocessors be programmed? The evolution of the programming model is already under way. One important function of the programming model is to describe how memory operates. For a multiprocessor, a reasonable model is sequential consistency (SC), which makes a multiprocessor behave like a multitasking uniprocessor. Nevertheless, many commercial multiprocessors support more relaxed memory models. The author argues that multiprocessors should support SC because, with speculative execution, relaxed models do not provide sufficient additional performance to justify exposing their complexity to the authors of low-level software.
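The abstract's description of SC as making a multiprocessor behave like a multitasking uniprocessor can be illustrated with a small sketch. It is not taken from the article: the two-thread test program, the register names `r0` and `r1`, and the helper functions are all hypothetical. The sketch enumerates every interleaving of the two threads that preserves each thread's program order and runs each one against a single shared memory, which is one way to read the SC definition.

```python
from itertools import combinations

# Hypothetical two-thread test program (illustrative, not from the article).
# Each operation is (kind, variable, value-to-store or destination register).
THREAD_0 = [("store", "x", 1), ("load", "y", "r0")]
THREAD_1 = [("store", "y", 1), ("load", "x", "r1")]


def interleavings(a, b):
    """Yield every merge of a and b that keeps each thread's program order."""
    total = len(a) + len(b)
    for slots in combinations(range(total), len(a)):
        slot_set = set(slots)
        ia, ib = iter(a), iter(b)
        # Positions in slot_set take the next op of a; the rest take b's.
        yield [next(ia) if i in slot_set else next(ib) for i in range(total)]


def run(schedule):
    """Execute one interleaving against a single shared memory."""
    memory = {"x": 0, "y": 0}
    registers = {}
    for kind, var, arg in schedule:
        if kind == "store":
            memory[var] = arg
        else:
            registers[arg] = memory[var]
    return registers["r0"], registers["r1"]


def sc_outcomes():
    """Collect the (r0, r1) results reachable under this SC reading."""
    return {run(s) for s in interleavings(THREAD_0, THREAD_1)}


if __name__ == "__main__":
    print(sorted(sc_outcomes()))
```

Under this reading the result (0, 0) never appears, because some store always comes first in any interleaving. A relaxed model that lets a load complete before an earlier store to a different address becomes visible can additionally produce (0, 0), which is the kind of complexity the abstract says is exposed to authors of low-level software.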