CATCH: A mechanism for dynamically detecting cache-content-duplication in instruction caches

Authors:
Marios Kleanthous; Yiannakis Sazeides
Affiliations:
University of Cyprus, Nicosia, Cyprus; University of Cyprus, Nicosia, Cyprus
Venue:
ACM Transactions on Architecture and Code Optimization (TACO)
Year:
2011



Abstract

Cache-content-duplication (CCD) occurs when a block misses in a cache while the entire content of the missed block is already present in the cache in a block with a different tag. Caches that are aware of content duplication can lower the miss penalty by servicing a miss to a duplicate block directly from the cache instead of accessing a lower level of the memory hierarchy, and can lower the miss rate by allowing only blocks with unique content to enter the cache. This work examines the potential of CCD for instruction caches. We show that CCD is a frequent phenomenon and that an idealized duplication-detection mechanism for instruction caches has the potential to increase the performance of an out-of-order processor with a 16KB, 8-way instruction cache holding 8 instructions per block, often by more than 10% and by up to 36%. This work also proposes CATCH, a hardware mechanism for dynamically detecting CCD in instruction caches. Experimental results for an out-of-order processor show that a duplication-detection mechanism with a 1.38KB cost captures, on average, 58% of CCD's idealized potential.
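To make the definition of CCD concrete, the following is a minimal Python sketch of an idealized, oracle-style duplicate check. It is not the CATCH hardware mechanism, whose design the abstract does not describe. The class name `DuplicationAwareCache`, the fully associative LRU organization, and the representation of block content as a tuple of instruction strings are all assumptions made for illustration. The sketch only classifies misses as duplicate or unique; it still inserts duplicate blocks, so it does not model the miss-rate benefit of admitting only unique content.

```python
from collections import OrderedDict


class DuplicationAwareCache:
    """Toy fully associative LRU cache (hypothetical) that labels each miss
    as a duplicate miss or a unique miss, following the abstract's definition."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.blocks = OrderedDict()  # tag -> block content, oldest entry first
        self.hits = 0
        self.duplicate_misses = 0
        self.unique_misses = 0

    def access(self, tag, content):
        if tag in self.blocks:
            self.blocks.move_to_end(tag)  # refresh LRU position
            self.hits += 1
            return "hit"

        # Miss: an idealized oracle compares the missed block's content
        # with every resident block, regardless of tag.
        is_duplicate = content in self.blocks.values()
        if is_duplicate:
            self.duplicate_misses += 1
        else:
            self.unique_misses += 1

        # Assumption: the block is inserted either way, evicting the LRU entry.
        if len(self.blocks) >= self.num_blocks:
            self.blocks.popitem(last=False)
        self.blocks[tag] = content
        return "duplicate_miss" if is_duplicate else "unique_miss"


# Two different addresses (tags) hold the same instruction sequence.
body = ("ld", "add", "st", "br")
other = ("mul", "sub", "nop", "br")
cache = DuplicationAwareCache(num_blocks=4)
trace = [(0x100, body), (0x200, other), (0x300, body), (0x100, body)]
results = [cache.access(tag, content) for tag, content in trace]
print(results)
# ['unique_miss', 'unique_miss', 'duplicate_miss', 'hit']
```

In this toy trace, the access to tag 0x300 misses, but its content already resides under tag 0x100, so a duplication-aware cache could in principle supply the block from the cache itself rather than from a lower level of the hierarchy.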