Reuse distance based performance modeling and workload mapping

  • Authors:
  • Sai Prashanth Muralidhara;Mahmut Kandemir;Orhan Kislal

  • Affiliations:
  • Pennsylvania State University, State College, PA, USA;Pennsylvania State University, State College, PA, USA;Pennsylvania State University, State College, PA, USA

  • Venue:
  • Proceedings of the 9th conference on Computing Frontiers
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Modern multicore architectures have multiple cores connected to a hierarchical cache structure resulting in heterogeneity in cache sharing across different subsets of cores. In these systems, overall throughput and efficiency depends heavily on a careful mapping of applications to available cores. In this paper, we study the problem of application-to-core mapping with the goal of trying to improve the overall cache performance in the presence of a hierarchical multi-level cache structure. We propose to sample the memory access patterns of individual applications and build their reuse distance distributions. Further, we propose to use these reuse distance distributions to compute an application-to-core mapping that tries to improve the overall cache performance, and consequently, the overall throughput. We show that our proposed mapping scheme is very effective in practice yielding throughput benefits of about 39% over the worst case mapping and about 30% over the default operating system based mapping. We believe, as larger chip multiprocessors with deeper cache hierarchies are projected to be the norm in the future, efficient mapping of applications to cores will become a vital requirement to extract the maximum possible performance from these systems.