Global management of cache hierarchies

Authors:
Mohamed Zahran;Sally A. McKee
Affiliations:
City University of New York, New York, NY, USA;Chalmers University of Technology, Gothenburg, Sweden
Venue:
Proceedings of the 7th ACM international conference on Computing frontiers
Year:
2010

Abstract

Cache memories currently treat all blocks as if they were equally important, but this assumption is not always valid. For instance, not all blocks deserve to be in the L1 cache. We therefore propose globalized block placement: a global placement algorithm that manages blocks in a cache hierarchy by deciding where in the hierarchy an incoming block should be placed. Our technique makes these decisions by adapting to the access patterns of different blocks. The contributions of this paper are fourfold. First, we motivate our solution by demonstrating the importance of a globalized placement scheme. Second, we present a method to categorize cache block behavior into one of four categories. Third, we present one potential design exploiting this categorization. Finally, we demonstrate the performance of our design. On the SPEC CPU benchmark suite, the proposed scheme improves overall system performance (IPC) by an average of 12% over a traditional LRU scheme while reducing traffic between the L1 and L2 caches by an average of 20%. All of this is achieved with a table as small as 3 KB.
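
To make the idea of placement by category concrete, the following is a minimal sketch of a two-level hierarchy in which a small table decides where an incoming block goes. It is not the design from the paper. The abstract states only that there are four categories, so the four `Placement` values, the `LRUCache` model, the `access` function, and the default for blocks missing from the table are all assumptions chosen for illustration.

```python
from collections import OrderedDict
from enum import Enum


class Placement(Enum):
    # Hypothetical categories: the abstract does not name the four it uses.
    L1_AND_L2 = 0
    L1_ONLY = 1
    L2_ONLY = 2
    BYPASS = 3


class LRUCache:
    """Fully associative LRU cache that tracks block addresses only."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def lookup(self, addr):
        # A hit moves the block to the most-recently-used position.
        if addr in self.blocks:
            self.blocks.move_to_end(addr)
            return True
        return False

    def insert(self, addr):
        self.blocks[addr] = True
        self.blocks.move_to_end(addr)
        # Evict the least-recently-used block when over capacity.
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)


def access(addr, l1, l2, table):
    """Serve one access and return the level that supplied it."""
    if l1.lookup(addr):
        return "L1"
    hit_l2 = l2.lookup(addr)
    # Assumption: blocks with no table entry get conventional placement.
    placement = table.get(addr, Placement.L1_AND_L2)
    if placement in (Placement.L1_AND_L2, Placement.L1_ONLY):
        l1.insert(addr)
    if not hit_l2 and placement in (Placement.L1_AND_L2, Placement.L2_ONLY):
        l2.insert(addr)
    return "L2" if hit_l2 else "MEM"
```

In this sketch a block tagged `L2_ONLY` never occupies an L1 frame, so it cannot evict a block that benefits more from L1, and it generates no L1 fill traffic. How the table entries are learned from access patterns is the subject of the paper and is not modeled here.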