Reducing the harmful effects of last-level cache polluters with an OS-level, software-only pollute buffer

  • Authors:
  • Livio Soares;David Tam;Michael Stumm

  • Affiliations:
  • Department of Electrical and Computer Engineering, University of Toronto, Canada;Department of Electrical and Computer Engineering, University of Toronto, Canada;Department of Electrical and Computer Engineering, University of Toronto, Canada

  • Venue:
  • Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

It is well recognized that LRU cache-line replacement can be ineffective for applications with large working sets or non-localized memory access patterns. Specifically, in last-level processor caches, LRU can cause cache pollution by inserting non-reuseable elements into the cache while evicting reusable ones. The work presented in this paper addresses last-level cache pollution through a dynamic operating system mechanism, called ROCS, requiring no change to underlying hardware and no change to applications. ROCS employs hardware performance counters on a commodity processor to characterize application cache behavior at run-time. Using this online profiling, cache unfriendly pages are dynamically mapped to a pollute buffer in the cache, eliminating competition between reusable and non-reusable cache lines. The operating system implements the pollute buffer through a page-coloring based technique, by dedicating a small slice of the last-level cache to store non-reusable pages. Measurements show that ROCS, implemented in the Linux 2.6.24 kernel and running on a 2.3GHz PowerPC 970FX, improves performance of memory-intensive SPEC CPU 2000 and NAS benchmarks by up to 34%, and 16% on average.