Distributed Shared Memory: A Survey of Issues and Algorithms
Computer - Distributed computing systems: separate resources acting as one
SPEED DMON: cache coherence on an optical multichannel interconnect architecture
Journal of Parallel and Distributed Computing - Special issue on parallel computing with optical interconnects
The SGI Origin: a ccNUMA highly scalable server
Proceedings of the 24th annual international symposium on Computer architecture
Using prediction to accelerate coherence protocols
Proceedings of the 25th annual international symposium on Computer architecture
Designing and evaluating a cost-effective optical network for multiprocessors
Journal of Parallel and Distributed Computing
Parallel Computer Architecture: A Hardware/Software Approach
Parallel Computer Architecture: A Hardware/Software Approach
A Simulation Study of Hardware-Oriented DSM Approaches
IEEE Parallel & Distributed Technology: Systems & Technology
Gemini: An Optical Interconnection Network for Parallel Processing
IEEE Transactions on Parallel and Distributed Systems
The Use of Prediction for Accelerating Upgrade Misses in cc-NUMA Multiprocessors
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Owner prediction for accelerating cache-to-cache transfer misses in a cc-NUMA architecture
Proceedings of the 2002 ACM/IEEE conference on Supercomputing
A Scalable Cache Coherent Scheme Exploiting Wormhole Routing Networks
HPCA '99 Proceedings of the 5th International Symposium on High Performance Computer Architecture
Packetization and routing analysis of on-chip multiprocessor networks
Journal of Systems Architecture: the EUROMICRO Journal - Special issue: Networks on chip
Shared memory computing on clusters with symmetric multiprocessors and system area networks
ACM Transactions on Computer Systems (TOCS)
Design of a high-speed optical interconnect for scalable shared memory multiprocessors
HOTI '04 Proceedings of the High Performance Interconnects, 2004. on Proceedings. 12th Annual IEEE Symposium
An Evaluation of the Oak Ridge National Laboratory Cray XT3
International Journal of High Performance Computing Applications
IEEE Computer Architecture Letters
Optical Switching and Networking
Analysis and compute of real-time signal flow delay for network on-chip
Proceedings of the 2011 International Conference on Innovative Computing and Cloud Computing
A novel approach to enhance distributed virtual memory
Computers and Electrical Engineering
Hi-index | 0.00 |
Recent advances in the development of optical technologies suggest the possible emergence of broadcast-based optical interconnects within cache-coherent distributed shared memory (DSM) multiprocessor architectures. It is well known that the cache-coherence protocol is a critical issue in designing such architectures because it directly affects memory latencies. In this paper, we evaluate via simulation the performance of three directory-based cache-coherence protocols; strict request-response, intervention forwarding and reply forwarding on the Simultaneous Optical Multiprocessor Exchange Bus (SOME-Bus), which is a low-latency and high-bandwidth broadcast-based fiber-optic interconnection network supporting DSM. The simulated system contains 64 nodes, each of which has a processor, a cache controller, a directory controller and an output channel. Simulations have been conducted for each protocol to measure average processor utilization, average network latency and average number of packets transferred over the network for varying values of the important DSM parameters such as the ratio of the mean channel service time to mean thread run time (T/R), probability of a cache block being in modified state {P(M)}, the fraction of write misses {P(W)} and home node contention rate. The results reveal that for all cases, except for low values of P(M), intervention forwarding gives the worst performance (lowest processor utilization and highest latency). The performance of strict request-response and reply forwarding is comparable for several values of the DSM parameters and contention rate. For a contention rate of 0%, the increase of P(M) makes reply forwarding perform better than strict request-response. The performance of all protocols decreases with the increase of P(W) and contention rate. However, the performance of strict request-response is the least affected among other protocols due to the negative impact of the increase of P(W) and contention rate. Therefore, for the full contention case (i.e. contention rate of 100%); for low values of P(M), or for mid values of P(M) and high values of P(W), strict request-response performs better than reply forwarding. These results are significant in the sense that they provide an insight to multiprocessor architecture designers for comparing the performance of different directory-based cache-coherence protocols on a broadcast-based interconnection network for different values of the DSM parameters and varying rates of contention.