Improving shared cache behavior of multithreaded object-oriented applications in multicores

  • Authors:
  • Mahmut Kandemir;Shekhar Srikantaiah;Seung Woo Son

  • Affiliations:
  • Pennsylvania State University, University Park, PA;Pennsylvania State University, University Park, PA;Argonne National Lab, Argonne, IL

  • Venue:
  • Proceedings of the International Conference on Computer-Aided Design
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Understanding shared cache performance when executing multithreaded object-oriented applications and optimizing these applications for multicores have not received much attention. In this paper, we first quantify the intra-thread and inter-thread cache line (block) reuse characteristics of a set of multithreaded C++ programs when executed in shared cache based multicores. Our results show that, as far as shared on-chip caches are concerned, inter-thread cache line (block) reuse distances are much higher than intra-thread cache line reuse distances. We study the impact of these characteristics on the hit/miss behavior of the shared last-level cache on a commercial multicore machine. We then show that, by rearranging accesses to the objects shared across different threads and to the objects stored in nearby memory locations, inter-thread (temporal and spatial) object reuse distances can be reduced, which in turn helps to reduce inter-thread cache line reuse distances. The results we collected using eight multithreaded applications show that our proposed shared cache-aware code restructuring strategy can reduce misses in the last-level on-chip cache of a commercial multicore machine by 25.4%, on average. These savings in cache misses translate in turn to average execution time improvement of 11.9%.