A memory-layout oriented run-time technique for locality optimization

  • Authors: Yong Yan, Xiaodong Zhang, Zhao Zhang
  • Venue: ICPP '98, Proceedings of the 1998 International Conference on Parallel Processing
  • Year: 1998

Abstract

Exploiting locality at run-time complements compiler-based approaches for applications with dynamic memory access patterns. This paper proposes a memory-layout oriented approach that exploits cache locality for parallel loops at run-time on Symmetric Multi-Processor (SMP) systems. Guided by application-dependent hints and the targeted cache architecture, it reorganizes and partitions a parallel loop by shrinking and partitioning the loop's memory access space at run-time. In the generated task partitions, data sharing among partitions is minimized and data reuse within a partition is maximized. The tasks in the partitions are scheduled in an adaptive, locality-preserving way to achieve balanced execution, minimizing application execution time by trading off load balance against locality.

Based on simulation and measurement, we show that our run-time approach achieves performance comparable to compiler optimizations for two applications whose load balance and cache locality can be well optimized by tiling and other program transformations. Our experimental results also show that the approach significantly improves memory performance for applications with dynamic memory access patterns, which are usually hard for compilers to optimize.
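
The partitioning idea can be illustrated with a minimal sketch. This is not the paper's algorithm; it is a simplified stand-in that assumes each loop iteration touches a single address known only at run-time. The names `partition_by_block`, `accesses`, `block_size`, and `num_parts` are hypothetical. Iterations that fall in the same cache-sized memory block are kept together (reuse within a partition, no block shared across partitions), and whole blocks are assigned greedily to the lightest partition for load balance.

```python
from collections import defaultdict


def partition_by_block(accesses, block_size, num_parts):
    """Group loop iterations by the memory block they touch, then assign
    whole blocks to partitions so that no block is split across partitions.

    accesses[i] is the (hypothetical) address touched by iteration i.
    """
    # Step 1: bucket iteration indices by the block their address falls in.
    groups = defaultdict(list)
    for iteration, addr in enumerate(accesses):
        groups[addr // block_size].append(iteration)

    # Step 2: place the largest groups first, each into the partition that
    # currently holds the fewest iterations (greedy load balancing).
    parts = [[] for _ in range(num_parts)]
    for block in sorted(groups, key=lambda b: (-len(groups[b]), b)):
        lightest = min(range(num_parts), key=lambda p: len(parts[p]))
        parts[lightest].extend(groups[block])
    return parts


if __name__ == "__main__":
    # Irregular, data-dependent access pattern known only at run-time.
    addrs = [0, 100, 3, 101, 7, 205, 102, 1]
    for p, iters in enumerate(partition_by_block(addrs, 64, 2)):
        print(f"partition {p}: iterations {iters}")
```

The sketch leaves out the adaptive scheduling step described in the abstract; a fuller version would let an idle processor take work from another partition only when load imbalance outweighs the loss of locality.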