Computers and Electrical Engineering
This article comparatively evaluates four hardware implementations of distributed shared memory (DSM): two CC-NUMA architectures (Dash and SCI) and two COMA architectures (KSR1 and DDM). Because our analysis compares approaches rather than implementations, we assumed a hierarchical, two-level, cluster-based system with a uniform bus-based cluster structure on the first level, and we simulated the DSM mechanisms of the four approaches on the second level. The simulation methodology was based on synthetic address traces, because this approach was the most suitable for this study. The comparison covered a broad range of system-oriented, application-oriented, and technology-oriented parameters. The results show that COMA protocols are somewhat more efficient, because responsibility for shared data migrates dynamically. In addition, the available interconnection network bandwidth greatly affects system scalability (ring-based systems achieve almost linear speedup).