Dynamic cache management in multi-core architectures through run-time adaptation

Authors:
Fazal Hameed;Lars Bauer;Jörg Henkel
Affiliations:
Karlsruhe Institute of Technology, Karlsruhe, Germany;Karlsruhe Institute of Technology, Karlsruhe, Germany;Karlsruhe Institute of Technology, Karlsruhe, Germany
Venue:
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
Year:
2012

Citing 11
Cited 1

Predicting Inter-Thread Cache Contention on a Chip Multi-Processor Architecture

HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
Interconnections in Multi-Core Architectures: Understanding Mechanisms, Overheads and Scaling

Proceedings of the 32nd annual international symposium on Computer Architecture
A Case for MLP-Aware Cache Replacement

Proceedings of the 33rd annual international symposium on Computer Architecture
Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches

Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Cooperative cache partitioning for chip multiprocessors

Proceedings of the 21st annual international conference on Supercomputing
An Adaptive Shared/Private NUCA Cache Partitioning Scheme for Chip Multiprocessors

HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
Challenges and Solutions for Late- and Post-Silicon Design

IEEE Design & Test
Adaptive insertion policies for managing shared caches

Proceedings of the 17th international conference on Parallel architectures and compilation techniques
PIPP: promotion/insertion pseudo-partitioning of multi-core shared caches

Proceedings of the 36th annual international symposium on Computer architecture
Scaling the bandwidth wall: challenges in and avenues for CMP scaling

Proceedings of the 36th annual international symposium on Computer architecture
Off-chip memory bandwidth minimization through cache partitioning for multi-core platforms

Proceedings of the 47th Design Automation Conference

Reducing inter-core cache contention with an adaptive bank mapping policy in DRAM cache

Proceedings of the Ninth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis

Quantified Score

Hi-index	0.00

Visualization

Abstract

Non-Uniform Cache Access (NUCA) architectures provide a potential solution to reduce the average latency for the last-level-cache (LLC), where the cache is organized into per-core local and remote partitions. Recent research has demonstrated the benefits of cooperative cache sharing among local and remote partitions. However, ignoring cache access patterns of concurrently executing applications sharing the local and remote partitions can cause inter-partition contention that reduces the overall instruction throughput. We propose a dynamic cache management scheme for LLC in NUCA-based architectures, which reduces inter-partition contention. Our proposed scheme provides efficient cache sharing by adapting migration, insertion, and promotion policies in response to the dynamic requirements of the individual applications with different cache access behaviors. Our adaptive cache management scheme allows individual cores to steal cache capacity from remote partitions to achieve better resource utilization. On average, our proposed scheme increases the performance (instructions per cycle) by 28% (minimum 8.4%, maximum 75%) compared to a private LLC organization.