As microprocessor speeds continue to improve at a very fast rate, the bandwidth requirements for system-level interconnections in multiprocessors may eventually rule out the use of shared buses, even for small-scale multiprocessors. On the other hand, high-speed unidirectional links are an emerging technology that has the potential to scale with microprocessor technology and could replace buses as the interconnection fabric for future multiprocessors. In this paper, we evaluate the performance of the unidirectional slotted ring interconnection for small- to medium-scale shared-memory systems, using a hybrid methodology of analytical models and trace-driven simulations. We use memory traces from the actual execution of parallel programs to drive detailed event-driven simulations of a variety of ring and bus multiprocessors. Snooping and directory coherence protocols for the slotted ring are evaluated in the context of multitasking. Snooping is shown to outperform full-map and linked-list directory schemes in the unidirectional slotted ring, and it also compares favorably to high-performance split-transaction bus systems.