An evaluation of directory schemes for cache coherence
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
Efficient synchronization primitives for large-scale cache-coherent multiprocessors
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Analysis of cache invalidation patterns in multiprocessors
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Munin: distributed shared memory based on type-specific memory coherence
PPOPP '90 Proceedings of the second ACM SIGPLAN symposium on Principles & practice of parallel programming
LimitLESS directories: A scalable cache coherence scheme
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
The Stanford Dash Multiprocessor
Computer
SPLASH: Stanford parallel applications for shared-memory
ACM SIGARCH Computer Architecture News
Cache Invalidation Patterns in Shared-Memory Multiprocessors
IEEE Transactions on Computers
The design and analysis of DASH: a scalable directory-based multiprocessor
The design and analysis of DASH: a scalable directory-based multiprocessor
Cooperative shared memory: software and hardware for scalable multiprocessors
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Integrating message-passing and shared-memory: early experience
PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
The Wisconsin Wind Tunnel: virtual prototyping of parallel computers
SIGMETRICS '93 Proceedings of the 1993 ACM SIGMETRICS conference on Measurement and modeling of computer systems
APRIL: a processor architecture for multiprocessing
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Cache coherence protocols for large-scale multiprocessors
Cache coherence protocols for large-scale multiprocessors
Cooperative shared memory: software and hardware for scalable multiprocessors
ACM Transactions on Computer Systems (TOCS)
Compiling for shared-memory and message-passing computers
ACM Letters on Programming Languages and Systems (LOPLAS)
An evaluation of directory protocols for medium-scale shared-memory multiprocessors
ICS '94 Proceedings of the 8th international conference on Supercomputing
Cost/performance of a parallel computer simulator
PADS '94 Proceedings of the eighth workshop on Parallel and distributed simulation
Software-extended coherent shared memory: performance and cost
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
LCM: memory system support for parallel language implementation
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
A comprehensive bibliography of distributed shared memory
ACM SIGOPS Operating Systems Review
On characterizing bandwidth requirements of parallel applications
Proceedings of the 1995 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Dynamic self-invalidation: reducing coherence overhead in shared-memory multiprocessors
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
CRL: high-performance all-software distributed shared memory
SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
Decoupled hardware support for distributed shared memory
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Synchronization hardware for networks of workstations: performance vs. cost
ICS '96 Proceedings of the 10th international conference on Supercomputing
Modeling cost/performance of a parallel computer simulator
ACM Transactions on Modeling and Computer Simulation (TOMACS)
Using prediction to accelerate coherence protocols
Proceedings of the 25th annual international symposium on Computer architecture
Retrospective: tempest and typhoon: user-level shared memory
25 years of the international symposia on Computer architecture (selected papers)
An Efficient Tree Cache Coherence Protocol for Distributed Shared Memory Multiprocessors
IEEE Transactions on Computers
ADir_pNB: A Cost-Effective Way to Implement Full Map Directory-Based Cache Coherence Protocols
IEEE Transactions on Computers
Application-specific protocols for user-level shared memory
Proceedings of the 1994 ACM/IEEE conference on Supercomputing
Abstracting network characteristics and locality properties of parallel systems
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Journal of Systems Architecture: the EUROMICRO Journal
A new perspective for efficient virtual-cache coherence
Proceedings of the 40th Annual International Symposium on Computer Architecture
This paper explores the complexity of implementing directory protocols by examining their mechanisms: primitive operations on directories, caches, and network interfaces. We compare the following protocols: Dir1B, Dir4B, Dir4NB, DirnNB [2], Dir1SW [9], and an improved version of Dir1SW (Dir1SW+). The comparison shows that the mechanisms and mechanism sequencing of Dir1SW and Dir1SW+ are simpler than those of the other protocols. We also compare protocol performance by running eight benchmarks on 32-processor systems. Simulations show that Dir1SW+'s performance is comparable to that of more complex directory protocols. The significant disparity in hardware complexity and the small difference in performance argue that Dir1SW+ may be a more effective use of resources. The small performance difference is attributable to two factors: the low degree of sharing in the benchmarks and Check-In/Check-Out (CICO) directives [9].
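To make the protocol names concrete, the following is a minimal sketch of a limited-pointer directory entry. It assumes the usual reading of the DiriB / DiriNB notation, which the abstract does not spell out: each entry holds i sharer pointers, B means that pointer overflow is handled by a later broadcast invalidation, and NB means that an existing sharer is invalidated to free a pointer. The class name, its fields, and the oldest-first victim choice are hypothetical. The sketch does not model the software trapping of Dir1SW or CICO directives.

```python
from dataclasses import dataclass, field


@dataclass
class LimitedPointerEntry:
    """Hypothetical directory entry for one block, with at most `limit` pointers."""
    limit: int                  # i in Dir_iB / Dir_iNB (assumed notation)
    broadcast: bool             # True models Dir_iB, False models Dir_iNB
    sharers: list = field(default_factory=list)
    overflowed: bool = False    # set only in the broadcast variant

    def read(self, node: int) -> list:
        """Record a read by `node`; return nodes invalidated to free a pointer."""
        if self.overflowed or node in self.sharers:
            return []
        if len(self.sharers) < self.limit:
            self.sharers.append(node)
            return []
        if self.broadcast:
            # Broadcast variant: stop tracking sharers; the next write broadcasts.
            self.overflowed = True
            return []
        # No-broadcast variant: invalidate the oldest tracked sharer (assumed policy).
        victim = self.sharers.pop(0)
        self.sharers.append(node)
        return [victim]

    def write(self, node: int, num_nodes: int) -> list:
        """Record a write by `node`; return nodes that are sent invalidations."""
        if self.overflowed:
            targets = [n for n in range(num_nodes) if n != node]
        else:
            targets = [n for n in self.sharers if n != node]
        self.sharers = [node]
        self.overflowed = False
        return targets
```

With `limit=1`, a second reader forces an immediate invalidation in the no-broadcast variant, while the broadcast variant defers the cost to the next write, which must then message every other node. When few nodes share a block, both paths are rarely exercised, which is consistent with the abstract's point about the low degree of sharing in the benchmarks.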