Architectural support for operating system-driven CMP cache management

  • Authors:
  • Nauman Rafique;Won-Taek Lim;Mithuna Thottethodi

  • Affiliations:
  • Purdue University, West Lafayette, IN;Purdue University, West Lafayette, IN;Purdue University, West Lafayette, IN

  • Venue:
  • Proceedings of the 15th international conference on Parallel architectures and compilation techniques
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

The role of the operating system (OS) in managing shared resources such as CPU time, memory, peripherals, and even energy is well motivated and understood [23]. Unfortunately, one key resource—lower-level shared cache in chip multi-processors—is commonly managed purely in hardware by rudimentary replacement policies such as least-recentlyused (LRU). The rigid nature of the hardware cache management policy poses a serious problem since there is no single best cache management policy across all sharing scenarios. For example, the cache management policy for a scenario where applications from a single organization are running under "best effort" performance expectation is likely to be different from the policy for a scenario where applications from competing business entities (say, at a third party data center) are running under a minimum service level expectation. When it comes to managing shared caches, there is an inherent tension between flexibility and performance. On one hand, managing the shared cache in the OS offers immense policy flexibility since it may be implemented in software. Unfortunately, it is prohibitively expensive in terms of performance for the OS to be involved in managing temporally fine-grain events such as cache allocation. On the other hand, sophisticated hardware-only cache management techniques to achieve fair sharing or throughput maximization have been proposed. But they offer no policy flexibility.This paper addresses this problem by designing architectural support for OS to efficiently manage shared caches with a wide variety of policies. Our scheme consists of a hardware cache quota management mechanism, an OS interface and a set of OS level quota orchestration policies. The hardware mechanism guarantees that OS-specified quotas are enforced in shared caches, thus eliminating the need for (and the performance penalty of) temporally fine-grained OS intervention. The OS retains policy flexibility since it can tune the quotas during regularly scheduled OS interventions. We demonstrate that our scheme can support a wide range of policies including policies that provide (a) passive performance differentiation, (b) reactive fairness by miss-rate equalization and (c) reactive performance differentiation.