Performance evaluation of directory protocols on an optical broadcast-based distributed shared memory multiprocessor

Authors:
İpek Abasıkeleş;M. Fatih Akay
Affiliations:
Computer Engineering Department, Çukurova University, 01330 Adana, Turkey;Computer Engineering Department, Çukurova University, 01330 Adana, Turkey
Venue:
Computers and Electrical Engineering
Year:
2010

Abstract

Recent advances in optical technologies suggest that broadcast-based optical interconnects may emerge within cache-coherent distributed shared memory (DSM) multiprocessor architectures. The cache-coherence protocol is a critical issue in designing such architectures because it directly affects memory latencies. In this paper, we evaluate via simulation the performance of three directory-based cache-coherence protocols (strict request-response, intervention forwarding and reply forwarding) on the Simultaneous Optical Multiprocessor Exchange Bus (SOME-Bus), a low-latency, high-bandwidth, broadcast-based fiber-optic interconnection network supporting DSM. The simulated system contains 64 nodes, each of which has a processor, a cache controller, a directory controller and an output channel. For each protocol, simulations measure average processor utilization, average network latency and the average number of packets transferred over the network for varying values of the important DSM parameters: the ratio of mean channel service time to mean thread run time, T/R; the probability of a cache block being in the modified state, P(M); the fraction of write misses, P(W); and the home node contention rate. The results reveal that in all cases, except for low values of P(M), intervention forwarding gives the worst performance (lowest processor utilization and highest latency). The performance of strict request-response and reply forwarding is comparable for several values of the DSM parameters and contention rate. For a contention rate of 0%, increasing P(M) makes reply forwarding perform better than strict request-response. The performance of all protocols decreases as P(W) and the contention rate increase; however, strict request-response is the least affected of the three protocols by these increases. Therefore, in the full contention case (i.e. a contention rate of 100%), strict request-response performs better than reply forwarding for low values of P(M), or for mid values of P(M) combined with high values of P(W). These results are significant because they give multiprocessor architecture designers insight into how different directory-based cache-coherence protocols compare on a broadcast-based interconnection network for different values of the DSM parameters and varying rates of contention.
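
To make the three protocol names concrete, the sketch below counts messages on the critical path of a miss, using the usual textbook description of a three-party transaction (requester, home node, owner). It is a minimal illustration under stated assumptions, not the simulator used in the paper: the message counts, the function name `expected_critical_path`, and the reading of P(M) as the probability that the requested block is modified in a remote cache are all assumptions introduced here.

```python
# Assumed textbook message counts on the critical path of a miss.
# These are NOT taken from the paper's SOME-Bus simulator.

# Block is clean at the home node: request to home, reply from home.
CRITICAL_PATH_CLEAN = 2

# Block is modified in a third node's cache (the owner).
CRITICAL_PATH_DIRTY = {
    # requester -> home, home -> requester (owner identity),
    # requester -> owner, owner -> requester (data)
    "strict_request_response": 4,
    # requester -> home, home -> owner, owner -> home, home -> requester
    "intervention_forwarding": 4,
    # requester -> home, home -> owner, owner -> requester
    "reply_forwarding": 3,
}


def expected_critical_path(protocol: str, p_m: float) -> float:
    """Expected critical-path message count for one miss.

    p_m is treated as the probability that the requested block is
    modified in a remote cache (an assumption about how P(M) applies).
    """
    if not 0.0 <= p_m <= 1.0:
        raise ValueError("p_m must be a probability in [0, 1]")
    dirty = CRITICAL_PATH_DIRTY[protocol]
    return (1.0 - p_m) * CRITICAL_PATH_CLEAN + p_m * dirty


if __name__ == "__main__":
    for name in CRITICAL_PATH_DIRTY:
        print(name, expected_critical_path(name, 0.5))
```

This model counts messages only. It ignores channel service time, queueing and home node contention, so it cannot by itself account for the simulation results summarized in the abstract, such as strict request-response outperforming reply forwarding under full contention.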