An Iteration Partition Approach for Cache or Local Memory Thrashing on Parallel Processing

  • Authors:
  • J. Z. Fang;M. Lu

  • Affiliations:
  • -;-

  • Venue:
  • IEEE Transactions on Computers
  • Year:
  • 1993

Quantified Score

Hi-index 14.98

Visualization

Abstract

Parallel processing systems with cache or local memory in the memory hierarchies are considered. These systems have a local cache memory in each processor and usually employ a write-invalidate protocol for the cache coherence. In such systems, a problem called 'cache or local memory thrashing' can arise in executions of parallel programs, when the data unnecessarily moves back and forth between the caches or local memories in different processors. An approach to eliminate, or at least to reduce, such movement for nested parallel loops is presented. It is based on relations between array element accesses and enclosed loop indexes in the loops. The relations can be used to assign processors to execute the appropriate iterations for parallel loops in the loop nests with respect to the data in their caches or local memories. An algorithm for calculating the correct iteration of the parallel loop in terms of loop indexes of the previous iterations executed in the processor is presented. This method benefits parallel code with nested loop structures in a wide range of applications. The experimental results show that the technique can achieve speedups up to 2.