Defensive loop tiling for multi-core processor

  • Authors:
  • Bin Bao; Xiaoya Xiang

  • Affiliations:
  • University of Rochester, Rochester, NY; University of Rochester

  • Venue:
  • Proceedings of the 2012 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness
  • Year:
  • 2012

Abstract

Loop tiling is a compiler transformation that tailors an application's working set to fit in a cache hierarchy. On today's multicore processors, part of the hierarchy, especially the last-level cache (LLC), is shared. In this paper, we show that cache sharing requires special types of tiling depending on the co-run programs. We analyze the reasons for the performance difference and give a defensive strategy that consistently performs the best or near the best. For example, when compared with conservative tiling, which tiles for the private cache, the performance of defensive tiling is similar in solo runs but up to 20% higher in program co-runs when tested on an Intel multicore processor.
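
As an illustration of the transformation the abstract refers to, the following is a minimal sketch of generic loop tiling applied to matrix multiplication. It is not the defensive strategy of the paper, whose details are not given in the abstract. The function names (`tile_edge_for`, `matmul_naive`, `matmul_tiled`) are hypothetical, and the rule that three square tiles of doubles should fit in the cache budget is an assumption made only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (assumption, not from the paper): largest tile edge t
 * such that three t x t tiles of doubles fit in cache_bytes. Returns at
 * least 1 so the tiled loops always make progress. */
static size_t tile_edge_for(size_t cache_bytes) {
    size_t t = 1;
    while (3 * (t + 1) * (t + 1) * sizeof(double) <= cache_bytes)
        t++;
    return t;
}

/* Untiled reference: c = a * b for row-major n x n matrices. */
static void matmul_naive(size_t n, const double *a, const double *b, double *c) {
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
}

/* Tiled version: the iteration space is split into tile x tile blocks so
 * that the data touched by the three inner loops stays small enough to be
 * reused from cache before it is evicted. */
static void matmul_tiled(size_t n, size_t tile,
                         const double *a, const double *b, double *c) {
    assert(tile > 0);
    for (size_t i = 0; i < n * n; i++)
        c[i] = 0.0;

    for (size_t ii = 0; ii < n; ii += tile)
        for (size_t kk = 0; kk < n; kk += tile)
            for (size_t jj = 0; jj < n; jj += tile) {
                /* Clamp block bounds so n need not be a multiple of tile. */
                size_t i_end = ii + tile < n ? ii + tile : n;
                size_t k_end = kk + tile < n ? kk + tile : n;
                size_t j_end = jj + tile < n ? jj + tile : n;

                for (size_t i = ii; i < i_end; i++)
                    for (size_t k = kk; k < k_end; k++) {
                        double aik = a[i * n + k];
                        for (size_t j = jj; j < j_end; j++)
                            c[i * n + j] += aik * b[k * n + j];
                    }
            }
}
```

In this sketch, passing the private cache size to `tile_edge_for` loosely corresponds to what the abstract calls conservative tiling, while passing the full LLC size would assume the program has the shared cache to itself. How defensive tiling chooses its tile size is not stated in the abstract, so the sketch leaves that choice to the caller.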