Performance analysis of methods that overcome false sharing effects in software DSMs

  • Authors:
  • Manjunath Kudlur;R. Govindarajan

  • Affiliations:
  • Department of Computer Science and Automation, Indian Institute of Science, Bangalore, 560 012, India and Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arb ...;Department of Computer Science and Automation, Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore 560 012, India

  • Venue:
  • Journal of Parallel and Distributed Computing
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

Page-based software DSMs experience high degrees of false sharing especially in irregular applications with fine grain sharing granularity. The overheads due to false sharing is considered to be a dominant factor limiting the performance of software DSMs. Several approaches have been proposed in the literature to reduce/eliminate false sharing. In this paper, we evaluate two of these approaches, viz., the Multiple Writer approach and the emulated fine grain sharing (EmFiGS) approach. Our evaluation strategy is two pronged. First, we use an implementation-independent analysis that uses overhead counts to compare the different approaches. Our analysis show that the benefits gained by eliminating false sharing are far outweighed by the performance penalty incurred due to the reduced exploitation of spatial locality in the EmFiGS approach. As a consequence, any implementation of the EmFiGS approach is likely to perform significantly worse than the Multiple Writer approach. Second, we use experimental evaluation to validate and complement our analysis. The experimental results match well with our analysis. Also the execution times of the application follow the same trend as in our analysis, reinforcing our conclusions. More specifically, the performance of the EmFiGS approach is significantly worse, by a factor of 1.5 to as much as 90 times, compared to the Multiple Writer approach. In many cases, the EmFiGS approach performs worse than even a single writer lazy release protocol which experiences very high overheads due to false sharing. The performance of the EmFiGS approach remains worse than the Multiple Writer approach even after incorporating Tapeworm--a record and replay technique that fetches pages ahead of demand in an aggregated fashion--to alleviate the spatial locality effect. We next present the effect of asynchronous message handling on the performance of different methods. Finally, we investigate the inter-play between spatial locality exploitation and false sharing elimination with varying sharing granularities in the EmFiGS approach and report the tradeoffs.