Capturing dynamic memory reference behavior with adaptive cache topology

Authors:
Jih-Kwon Peir;Yongjoon Lee;Windsor W. Hsu
Affiliations:
Computer & Information Science & Engineering Department, University of Florida, Gainesville, FL;Computer & Information Science & Engineering Department, University of Florida, Gainesville, FL;Computer Science Division, University of California, Berkeley, CA
Venue:
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Year:
1998

Citing 20
Cited 17

Cache design of a sub-micron CMOS system/370

ISCA '87 Proceedings of the 14th annual international symposium on Computer architecture
Cache Operations by MRU Change

IEEE Transactions on Computers
Cache performance of operating system and multiprogramming workloads

ACM Transactions on Computer Systems (TOCS)
A Case for Direct-Mapped Caches

Computer
A case for two-way skewed-associative caches

ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Column-associative caches: a technique for reducing the miss rate of direct-mapped caches

ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Evaluating stream buffers as a secondary cache replacement

ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Tradeoffs in two-level on-chip caching

ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Cache designs with partial address matching

MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
The difference-bit cache

ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Improving cache performance with balanced tag and data paths

Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Eliminating cache conflict misses through XOR-based placement functions

ICS '97 Proceedings of the 11th international conference on Supercomputing
Prefetching using Markov predictors

Proceedings of the 24th annual international symposium on Computer architecture
Run-time adaptive cache hierarchy management via reference analysis

Proceedings of the 24th annual international symposium on Computer architecture
LRU-based column-associative caches

ACM SIGARCH Computer Architecture News
Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers

ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Cache Memories

ACM Computing Surveys (CSUR)
Two Fast and High-Associativity Cache Schemes

IEEE Micro
DASC cache

HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Predictive sequential associative cache

HPCA '96 Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture

Functional Implementation Techniques for CPU Cache Memories

IEEE Transactions on Computers - Special issue on cache memory and related problems
Reducing cache misses using hardware and software page placement

ICS '99 Proceedings of the 13th international conference on Supercomputing
A fully associative software-managed cache design

Proceedings of the 27th annual international symposium on Computer architecture
Dead-block prediction & dead-block correlating prefetchers

ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Cache decay: exploiting generational behavior to reduce cache leakage power

ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Let caches decay: reducing leakage energy via exploitation of cache generational behavior

ACM Transactions on Computer Systems (TOCS)
Cache-Line Decay: A Mechanism to Reduce Cache Leakage Power

PACS '00 Proceedings of the First International Workshop on Power-Aware Computer Systems-Revised Papers
The V-Way Cache: Demand Based Associativity via Global Replacement

Proceedings of the 32nd annual international symposium on Computer Architecture
An efficient direct mapped instruction cache for application-specific embedded systems

CODES+ISSS '05 Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Balanced Cache: Reducing Conflict Misses of Direct-Mapped Caches

Proceedings of the 33rd annual international symposium on Computer Architecture
Adaptive Caches: Effective Shaping of Cache Behavior to Workloads

Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Reducing cache misses through programmable decoders

ACM Transactions on Architecture and Code Optimization (TACO)
A novel cache architecture with enhanced performance and security

Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Adaptive line placement with the set balancing cache

Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Customized placement for high performance embedded processor caches

ARCS'07 Proceedings of the 20th international conference on Architecture of computing systems
A comparative analysis of performance improvement schemes for cache memories

Computers and Electrical Engineering
ASCIB: adaptive selection of cache indexing bits for removing conflict misses

Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design

Quantified Score

Hi-index	0.00

Visualization

Abstract

Memory references exhibit locality and are therefore not uniformly distributed across the sets of a cache. This skew reduces the effectiveness of a cache because it results in the caching of a considerable number of less-recently-used lines which are less likely to be re-referenced before they are replaced. In this paper, we describe a technique that dynamically identifies these less-recently-used lines and effectively utilizes the cache frames they occupy to more accurately approximate the global least-recently-used replacement policy while maintaining the fast access time of a direct-mapped cache. We also explore the idea of using these underutilized cache frames to reduce cache misses through data prefetching. In the proposed design, the possible locations that a line can reside in is not predetermined. Instead, the cache is dynamically partitioned into groups of cache lines. Because both the total number of groups and the individual group associativity adapt to the dynamic reference pattern, we call this design the adaptive group-associative cache. Performance evaluation using trace-driven simulations of the TPC-C benchmark and selected programs from the SPEC95 benchmark suite shows that the group-associative cache is able to achieve a hit ratio that is consistently better than that of a 4-way set-associative cache. For some of the workloads, the hit ratio approaches that of a fully-associative cache.