A distributed shared memory multiprocessor ASURA: memory and cache architecture
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
The GLOW cache coherence protocol extensions for widely shared data
ICS '96 Proceedings of the 10th international conference on Supercomputing
A study of three dynamic approaches to handle widely shared data in shared-memory multiprocessors
ICS '98 Proceedings of the 12th international conference on Supercomputing
Modeling Live and Dead Lines in Cache Memory Systems
IEEE Transactions on Computers
Kiloprocessor Extensions to SCI
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
A computer architecture with access control and cache option tags on individual instruction operands
ACM SIGARCH Computer Architecture News
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
ATAC: a 1000-core cache-coherent processor with on-chip optical network
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Building expressive, area-efficient coherence directories
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
Using in-flight chains to build a scalable cache coherence protocol
ACM Transactions on Architecture and Code Optimization (TACO)
Hi-index | 0.00 |
Cache coherence problem is a major issue in the design of shared-memory multiprocessors. As the number of processors grows, traditional bus-based snoopy schemes for cache coherence are no longer adequate. Instead, the directory-based scheme is a promising alternative for the large-scale cache coherence problem. However, the storage overhead of (full-map) directory scheme may become too prohibitive as the system size goes up. This paper presents two distributed directory schemes, the tree directory and the hierarchical full-map directory, to deal with the storage overhead problem. Preliminary trace-driven evaluations show that the performance of our schemes compares favorably to the full-map directory scheme, while reducing the storage overhead by over 90%. These two schemes should lend themselves to the design and implementation of large-scale cache coherent multiprocessors.