Automatic software cache coherence through vectorization

Authors:
Ervan Darnell;John M. Mellor-Crummey;Ken Kennedy
Venue:
ICS '92 Proceedings of the 6th international conference on Supercomputing
Year:
1992

Citing 11

Automatic translation of FORTRAN programs to vector form
ACM Transactions on Programming Languages and Systems (TOPLAS)

Guide to parallel programming on Sequent computer systems: 2nd edition

Compiler-Directed Cache Management in Multiprocessors
Computer

Cache coherence in systems with parallel communication channels & many processors
Proceedings of the 1990 ACM/IEEE conference on Supercomputing

The Tera computer system
ICS '90 Proceedings of the 4th international conference on Supercomputing

An efficient caching support for critical sections in large-scale shared-memory multiprocessors
ICS '90 Proceedings of the 4th international conference on Supercomputing

The design and development of a very high speed system bus - the Encore Multimax Nanobus
ACM '86 Proceedings of 1986 ACM Fall joint computer conference

The directory-based cache coherence protocol for the DASH multiprocessor
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture

Dependence graphs and compiler optimizations
POPL '81 Proceedings of the 8th ACM SIGPLAN-SIGACT symposium on Principles of programming languages

Solving Linear Systems on Vector and Shared Memory Computers

Structure of Computers and Computations

Cited 5

Cache coherence using local knowledge
Proceedings of the 1993 ACM/IEEE conference on Supercomputing

Exploiting cache affinity in software cache coherence
ICS '94 Proceedings of the 8th international conference on Supercomputing

Eliminating Stale Data References through Array Data-Flow Analysis
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium

Exact Distributed Invalidation
Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing

Towards general and exact distributed invalidation
Journal of Parallel and Distributed Computing

Quantified Score

Hi-index	0.02

Abstract

Access latency in large-scale shared-memory multiprocessors is a concern since most (if not all) memory is one or more hops away through an interconnection network. Providing processors with one or more levels of cache is an accepted way to reduce the average access latency; however, in a multiprocessor, cached values must be kept coherent for the multiprocessor to support the abstraction of a shared global memory. There is no generally accepted hardware solution to provide cache coherence for large-scale shared-memory multiprocessors. Software coherence strategies offer scalability with current hardware. In this paper we examine a compiler-based software strategy for maintaining cache coherence that relies on dependence analysis and a vectorization algorithm to insert cache control directives. Experiments on the BBN TC2000 for a pair of numerical problems show that the run-time cost of coherence using our strategy is less than that for previously proposed compiler-based software methods and suggest that it should compare favorably with proposed hardware schemes.