Techniques for reducing consistency-related communication in distributed shared-memory systems

  • Authors:
  • John B. Carter; John K. Bennett; Willy Zwaenepoel

  • Affiliations:
  • Department of Computer Science, University of Utah, 3190, Merrill Engineering Building, Salt Lake City, UT; Computer Systems Laboratory, Rice University, Houston, TX; Computer Systems Laboratory, Rice University, Houston, TX

  • Venue:
  • ACM Transactions on Computer Systems (TOCS)
  • Year:
  • 1995

Abstract

Distributed shared memory (DSM) is an abstraction of shared memory on a distributed-memory machine. Hardware DSM systems support this abstraction at the architecture level; software DSM systems support the abstraction within the runtime system. One of the key problems in building an efficient software DSM system is to reduce the amount of communication needed to keep the distributed memories consistent. In this article we present four techniques for doing so: software release consistency; multiple consistency protocols; write-shared protocols; and an update-with-timeout mechanism. These techniques have been implemented in the Munin DSM system. We compare the performance of seven Munin application programs: first to their performance when implemented using message passing, and then to their performance when running on a conventional software DSM system that does not embody the preceding techniques. On a 16-processor cluster of workstations, Munin's performance is within 5% of message passing for four out of the seven applications. For the other three, performance is within 29 to 33%. Detailed analysis of two of these three applications indicates that the addition of a function-shipping capability would bring their performance to within 7% of the message-passing performance. Compared to a conventional DSM system, Munin achieves performance improvements ranging from a few to several hundred percent, depending on the application.
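
The abstract names write-shared protocols without describing how they work. As a rough illustration only, the sketch below assumes a twin-and-diff scheme: a node copies a page (the "twin") before its first write, and later sends only the bytes that differ from the twin. The page size, the helper names (make_twin, compute_diff, apply_diff), and the two-node setup are all hypothetical and are not taken from Munin's code.

```python
from typing import Dict

PAGE_SIZE = 8  # hypothetical tiny page size in bytes, chosen for illustration


def make_twin(page: bytearray) -> bytes:
    """Snapshot the page before the first local write."""
    return bytes(page)


def compute_diff(twin: bytes, page: bytearray) -> Dict[int, int]:
    """Return {offset: new_byte} for every byte that changed since the twin."""
    return {i: page[i] for i in range(len(page)) if page[i] != twin[i]}


def apply_diff(page: bytearray, diff: Dict[int, int]) -> None:
    """Merge a remote node's changes into the local copy of the page."""
    for offset, value in diff.items():
        page[offset] = value


# Two nodes hold copies of the same page and write disjoint parts of it.
node_a = bytearray(PAGE_SIZE)
node_b = bytearray(PAGE_SIZE)

twin_a = make_twin(node_a)
twin_b = make_twin(node_b)

node_a[0] = 1
node_a[1] = 2
node_b[6] = 9

# At a synchronization point each node sends only its changed bytes.
diff_a = compute_diff(twin_a, node_a)
diff_b = compute_diff(twin_b, node_b)

apply_diff(node_b, diff_a)
apply_diff(node_a, diff_b)
```

In this sketch the two nodes exchange three bytes in total rather than two full pages, which is the kind of reduction in consistency-related communication the abstract refers to. The merge is only well defined here because the writes touch disjoint offsets.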