Memory Contention in Scalable Cache-Coherent Multiprocessors

  • Authors:
  • Ricardo Bianchini; Mark E. Crovella; Leonidas Kontothanassis; Thomas J. LeBlanc

  • Year:
  • 1993

Abstract

Effective use of large-scale multiprocessors requires the elimination of all bottlenecks that reduce processor utilization. One such bottleneck is memory contention. In this paper we show that memory contention occurs in many parallel applications when they are run on large-scale shared-memory multiprocessors. In our simulations of several parallel applications on a large-scale machine, we observed that some applications exhibit near-perfect speedup on hundreds of processors when the effect of memory contention is ignored, and exhibit no speedup at all when memory contention is considered. As the number of processors increases, many applications exhibit an increase in both the number of hot spots and the degree of contention for each hot spot. In addition, we observed that hot spots are spread throughout memory for some applications, and that eliminating hot spots individually can cause other hot spots to worsen. These observations suggest that modern multiprocessors require some mechanism to alleviate hot-spot contention.

We evaluate the effectiveness of two mechanisms for dealing with hot-spot contention in direct-connected, distributed-shared-memory multiprocessors: queueing requests at the memory module, which allows a memory module to be more highly utilized during periods of contention, and increasing the effective bandwidth to memory by having the coherency protocol distribute the hot data across multiple memory modules. We show that queueing requires long queues at each memory module and does not perform as well as our proposed coherency protocol, which essentially eliminates memory contention in the applications we consider.
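
The trade-off described in the abstract can be illustrated with a back-of-the-envelope queueing calculation. The C sketch below is not the paper's simulator; it is a toy M/M/1 model with made-up parameters (aggregate arrival rate lambda, per-module service rate mu, replication factor k) that contrasts a single queued hot memory module with the same hot data replicated across k modules, as a distributing coherency protocol might arrange.

    /*
     * Toy M/M/1 comparison (illustrative only): a "hot" memory module
     * receives requests at aggregate rate lambda and serves them at
     * rate mu.  Replicating the hot data across k modules splits the
     * arrival stream k ways.  All parameter values are assumptions.
     */
    #include <stdio.h>

    /* Mean number of requests at a module (waiting plus in service)
     * with utilization rho (standard M/M/1 result: L = rho / (1 - rho)). */
    static double mm1_mean_requests(double rho)
    {
        return rho < 1.0 ? rho / (1.0 - rho) : -1.0;  /* -1 marks saturation */
    }

    int main(void)
    {
        double lambda = 0.9;  /* aggregate hot-spot request rate (req/cycle) */
        double mu     = 1.0;  /* service rate of one memory module           */

        for (int k = 1; k <= 8; k *= 2) {
            double rho = (lambda / k) / mu;   /* per-module utilization */
            double len = mm1_mean_requests(rho);
            if (len < 0.0)
                printf("k=%d: module saturated (rho >= 1)\n", k);
            else
                printf("k=%d: utilization %.2f, mean requests at module %.2f\n",
                       k, rho, len);
        }
        return 0;
    }

With the hot-spot load near a single module's capacity, the queue at one module grows long, while splitting the same load across even two or four modules drops per-module utilization, and hence queueing delay, sharply; this mirrors, in miniature, the qualitative comparison made in the abstract.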