Partitioning regular grid applications with irregular boundaries for cache-coherent multiprocessors

  • Authors:
  • Yang Zeng;Santosh G. Abraham

  • Affiliations:
  • -;-

  • Venue:
  • IPPS '95 Proceedings of the 9th International Symposium on Parallel Processing
  • Year:
  • 1995

Quantified Score

Hi-index 0.00

Visualization

Abstract

We consider the problem of partitioning applications that operate on a regular grid but have irregular boundaries for a cache-coherent multiprocessor. Domain decomposition techniques such as RSB have commonly been used to reduce interprocessor communication in message passing multiprocessors. We apply these partitioning algorithms on cache-coherent multiprocessors to reduce cache-coherency traffic. We find that the actual cache-coherency traffic is approximately double the estimated true coherency traffic, primarily due to false-sharing and the consequent false coherency traffic. We devise two techniques that eliminate false sharing traffic in partitions produced using the common domain decomposition algorithms. In our compensation algorithm, we modify the partition produced by the domain decomposition to ensure that all the nodes on a cache line are assigned to the same processor. In our coalescing algorithm, nodes belonging to the same cache line are coalesced into a single node and the weights on nodes and arcs adjusted to represent the overall computation and communication costs of the coalesced nodes. This coalesced graph is partitioned using a domain decomposition algorithm and then the coalesced nodes in the partition are expanded. Our experimental results using an Indian Ocean circulation application on the KSR1 multiprocessor demonstrate that compensation reduces coherency traffic by as much as 55% and execution time by up to 18% and that graph coalescing reduces coherency traffic by up to 74%.