This paper proposes a cache coherence strategy for large-scale multiprocessor systems that combines software and hardware approaches. The new strategy has the scalability advantages of existing software strategies and does not rely on shared hardware resources to maintain coherence. It exploits as much intra-task temporal locality as previously proposed low-cost, compiler-based strategies such as Simple Invalidation and Fast Selective Invalidation. With a small amount of additional hardware and a small set of cache management instructions, it preserves more inter-task temporal locality than those strategies. It is an economical alternative whose potential performance is close to that of more elaborate strategies such as Version Control and Time Stamp. The new strategy can also be readily extended to handle Doacross loops.
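To make the distinction between intra-task and inter-task temporal locality concrete, here is a minimal toy sketch in Python. It is not the paper's mechanism. It assumes that Simple Invalidation discards each processor's entire cache at every parallel-loop (epoch) boundary, and it compares that against a hypothetical "selective" policy that discards only addresses written by another processor during the epoch. The function `count_hits`, the trace format, and the selective policy are all illustrative assumptions.

```python
from typing import Dict, List, Set, Tuple

# One access is (processor id, "r" or "w", address). Hypothetical trace format.
Access = Tuple[int, str, int]


def count_hits(epochs: List[List[Access]], n_procs: int, selective: bool) -> int:
    """Count read hits in private caches over a sequence of epochs.

    Assumes Doall-style epochs: no cross-processor dependences inside an
    epoch, so coherence only has to be restored at epoch boundaries.
    """
    caches: List[Set[int]] = [set() for _ in range(n_procs)]
    hits = 0
    for epoch in epochs:
        written: Dict[int, Set[int]] = {}  # address -> processors that wrote it
        for proc, op, addr in epoch:
            if op == "r":
                if addr in caches[proc]:
                    hits += 1
                else:
                    caches[proc].add(addr)  # miss: fetch from memory
            else:
                caches[proc].add(addr)  # write-through with allocation (assumed)
                written.setdefault(addr, set()).add(proc)
        # Epoch boundary: restore coherence.
        for proc in range(n_procs):
            if selective:
                # Hypothetical policy: drop only lines another processor wrote.
                stale = {a for a, writers in written.items() if writers - {proc}}
                caches[proc] -= stale
            else:
                # Whole-cache invalidation (assumed model of Simple Invalidation).
                caches[proc].clear()
    return hits


trace = [
    [(0, "r", 10), (0, "r", 10), (1, "w", 20)],  # epoch 1
    [(0, "r", 10), (0, "r", 20), (1, "r", 20)],  # epoch 2
]
print(count_hits(trace, n_procs=2, selective=False))  # 1
print(count_hits(trace, n_procs=2, selective=True))   # 3
```

Both policies capture the repeated read of address 10 inside epoch 1, which is intra-task locality. Only the selective policy keeps the lines that are still valid across the boundary, so it gains two further hits in epoch 2, which is the inter-task locality the abstract refers to.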