The shared regions approach to software cache coherence on multiprocessors

Authors:
Harjinder S. Sandhu;Benjamin Gamsa;Songnian Zhou
Affiliations:
Univ. of Toronto, Toronto, Ont., Canada;Univ. of Toronto, Toronto, Ont., Canada;Univ. of Toronto, Toronto, Ont., Canada
Venue:
PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
Year:
1993

Citing 14
Cited 20

Evaluating the performance of software cache coherence

ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Multi-level shared caching techniques for scalability in VMP-M/C

ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Compiler-Directed Cache Management in Multiprocessors

Computer
Hector: A Hierarchically Structured Shared-Memory Multiprocessor

Computer - Special issue on experimental research in computer architecture
LimitLESS directories: A scalable cache coherence scheme

ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Coarse-grain parallel programming in Jade

PPOPP '91 Proceedings of the third ACM SIGPLAN symposium on Principles and practice of parallel programming
Comparison of hardware and software cache coherence schemes

ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Distributed shared memory with versioned objects

OOPSLA '92 conference proceedings on Object-oriented programming systems, languages, and applications
Cooperative shared memory: software and hardware for scalable multiprocessor

ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Mean-Value Analysis of Closed Multichain Queuing Networks

Journal of the ACM (JACM)
Weak ordering—a new definition

ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Memory consistency and event ordering in scalable shared-memory multiprocessors

ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
The directory-based cache coherence protocol for the DASH multiprocessor

ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
The NYU Ultracomputer—designing a MIMD, shared-memory parallel machine (Extended Abstract)

ISCA '82 Proceedings of the 9th annual symposium on Computer Architecture

Exploiting cache affinity in software cache coherence

ICS '94 Proceedings of the 8th international conference on Supercomputing
Performance evaluation of hybrid hardware and software distributed shared memory protocols

ICS '94 Proceedings of the 8th international conference on Supercomputing
A comprehensive bibliography of distributed shared memory

ACM SIGOPS Operating Systems Review
An analytic study of dynamic hardware and software cache coherence strategies

Proceedings of the 1995 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
CRL: high-performance all-software distributed shared memory

SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
MGS: a multigrain shared memory system

ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Ace: linguistic mechanisms for customizable protocols

PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
VM-based shared memory on low-latency, remote-memory-access networks

Proceedings of the 24th annual international symposium on Computer architecture
The design, implementation, and evaluation of Jade

ACM Transactions on Programming Languages and Systems (TOPLAS)
Object views: language support for intelligent object caching in parallel and distributed computations

Proceedings of the 14th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Compiler Support for Array Distribution onNUMA Shared Memory Multiprocessors

The Journal of Supercomputing
Implementing Scoped Behavior for Flexible Distributed Data Sharing

IEEE Concurrency
Aurora: Scoped Behavior for Per-Context Optimized Distributed Data Sharing

IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Multiple-writer entry consistency

Cluster computing
Exploiting high-level coherence information to optimize distributed shared state

Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
Consistency and event ordering in the shared regions model

CASCON '93 Proceedings of the 1993 conference of the Centre for Advanced Studies on Collaborative research: distributed computing - Volume 2
Shared memory computing on clusters with symmetric multiprocessors and system area networks

ACM Transactions on Computer Systems (TOCS)
Implementing optimized distributed data sharing using scoped behaviour and a class library

COOTS'97 Proceedings of the 3rd conference on USENIX Conference on Object-Oriented Technologies (COOTS) - Volume 3
A tuneable software cache coherence protocol for heterogeneous MPSoCs

CODES+ISSS '09 Proceedings of the 7th IEEE/ACM international conference on Hardware/software codesign and system synthesis
A programming model for deterministic task parallelism

Proceedings of the 2011 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness

Quantified Score

Hi-index	0.01

Visualization

Abstract

The effective management of caches is critical to the performance of applications on shared-memory multiprocessors. In this paper, we discuss a technique for software cache coherence tht is based upon the integration of a program-level abstraction for shared data with software cache management. The program-level abstraction, called Shared Regions, explicitly relates synchronization objects with the data they protect. Cache coherence algorithms are presented which use the information provided by shared region primitives, and ensure that shared regions are always cacheable by the processors accessing them. Measurements and experiments of the Shared Region approach on a shared-memory multiprocessors accessing them. Measurements and experiments of the Shared Region approach on a shared-memory multiprocessor are shown. Comparisons with other software based coherence strategies, including a user-controlled strategy and an operating system-based strategy, show that this approach is able to deliver better performance, with relatively low corresponding overhead and only a small increase in the programming effort. Compared to a compiler-based coherence strategy, the Shared Regions approach still performs better than a compiler that can achieve 90% accuracy in allowing cacheing, as long as the regions are a few hundred bytes or larger, or they are re-used a few times in the cache.