Relaxed consistency and coherence granularity in DSM systems: a performance evaluation

  • Authors:
  • Yuanyuan Zhou;Liviu Iftode;Jaswinder Pal Singh;Kai Li;Brian R. Toonen;Ioannis Schoinas;Mark D. Hill;David A. Wood

  • Affiliations:
  • Computer Science Department, Princeton University, Princeton, NJ;Computer Science Department, Princeton University, Princeton, NJ;Computer Science Department, Princeton University, Princeton, NJ;Computer Science Department, Princeton University, Princeton, NJ;Computer Sciences Department, University of Wisconsin, Madison, Madison, WI;Computer Sciences Department, University of Wisconsin, Madison, Madison, WI;Computer Sciences Department, University of Wisconsin, Madison, Madison, WI;Computer Sciences Department, University of Wisconsin, Madison, Madison, WI

  • Venue:
  • PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
  • Year:
  • 1997

Abstract

During the past few years, two main approaches have been taken to improve the performance of software shared memory implementations: relaxing consistency models and providing fine-grained access control. Their performance tradeoffs, however, are not well understood. This paper studies these tradeoffs on a platform that provides access control in hardware but runs coherence protocols in software. We compare the performance of three protocols across four coherence granularities, using 12 applications on a 16-node cluster of workstations. Our results show that no single combination of protocol and granularity performs best for all the applications. The combination of a sequentially consistent (SC) protocol and fine granularity works well with 7 of the 12 applications. The combination of a multiple-writer, home-based lazy release consistency (HLRC) protocol and page granularity works well with 8 of the 12 applications. For applications that suffer performance losses in moving to coarser granularity under sequential consistency, the performance can usually be regained quite effectively using relaxed protocols, particularly HLRC. We also find that the HLRC protocol performs substantially better than a single-writer lazy release consistent (SW-LRC) protocol at coarse granularity for many irregular applications. For our applications and platform, when we use the original versions of the applications ported directly from hardware-coherent shared memory, we find that the SC protocol with 256-byte granularity performs best on average. However, when the best versions of the applications are compared, the balance shifts in favor of HLRC at page granularity.