The V-Way Cache: Demand Based Associativity via Global Replacement

Authors:
Moinuddin K. Qureshi; David Thompson; Yale N. Patt
Affiliations:
University of Texas at Austin; University of Texas at Austin; University of Texas at Austin
Venue:
Proceedings of the 32nd annual international symposium on Computer Architecture
Year:
2005

Abstract

As processor speeds increase and memory latency becomes more critical, intelligent design and management of secondary caches become increasingly important. The efficiency of current set-associative caches is reduced because programs exhibit a non-uniform distribution of memory accesses across different cache sets. We propose a technique to vary the associativity of a cache on a per-set basis in response to the demands of the program. By increasing the number of tag-store entries relative to the number of data lines, we achieve the performance benefit of global replacement while maintaining the constant hit latency of a set-associative cache. The proposed variable-way, or V-Way, set-associative cache achieves an average miss rate reduction of 13% on sixteen benchmarks from the SPEC CPU2000 suite. This translates into an average IPC improvement of 8%.
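
The idea of having more tag-store entries than data lines can be illustrated with a small toy model. The sketch below is not the authors' implementation: the class name `VWayCache`, its parameters, the tag-to-data ratio of 2, and the clock-style sweep over small reuse counters used for global replacement are all assumptions made for illustration, since the abstract does not specify the replacement policy. It only shows how a heavily used set can hold more lines than its nominal share while lookups stay confined to one set.

```python
class VWayCache:
    """Toy model (hypothetical): more tag entries than data lines,
    with data lines allocated and evicted globally."""

    def __init__(self, num_sets, tag_ways, num_data_lines):
        self.num_sets = num_sets
        # Each tag entry is None (invalid) or [tag, data_index].
        self.tags = [[None] * tag_ways for _ in range(num_sets)]
        # Per-set recency order of ways; front of the list is least recent.
        self.lru = [list(range(tag_ways)) for _ in range(num_sets)]
        # Each data line remembers its owning (set, way) and a reuse counter.
        self.owner = [None] * num_data_lines
        self.reuse = [0] * num_data_lines
        self.hand = 0  # position of the global sweep

    def _touch(self, s, w):
        self.lru[s].remove(w)
        self.lru[s].append(w)

    def _global_victim(self):
        # Assumed policy: sweep the data lines, decrementing counters,
        # and pick the first line that is free or has a zero counter.
        n = len(self.owner)
        while True:
            i = self.hand
            self.hand = (self.hand + 1) % n
            if self.owner[i] is None or self.reuse[i] == 0:
                return i
            self.reuse[i] -= 1

    def access(self, block_addr):
        """Return True on a hit, False on a miss (the block is then filled)."""
        s = block_addr % self.num_sets
        tag = block_addr // self.num_sets
        ways = self.tags[s]
        for w, entry in enumerate(ways):
            if entry is not None and entry[0] == tag:
                d = entry[1]
                self.reuse[d] = min(self.reuse[d] + 1, 3)
                self._touch(s, w)
                return True
        free = [w for w, e in enumerate(ways) if e is None]
        if free:
            # Spare tag entry in this set: take a data line from anywhere.
            w = free[0]
            d = self._global_victim()
            if self.owner[d] is not None:
                old_set, old_way = self.owner[d]
                self.tags[old_set][old_way] = None
        else:
            # No spare tag entry: replace locally and reuse that data line.
            w = self.lru[s][0]
            d = ways[w][1]
        ways[w] = [tag, d]
        self.owner[d] = (s, w)
        self.reuse[d] = 0
        self._touch(s, w)
        return False


# Four blocks that all map to set 0, in a cache with 8 data lines
# (the data capacity of a 4-set, 2-way cache) but 4 tag entries per set.
cache = VWayCache(num_sets=4, tag_ways=4, num_data_lines=8)
first_pass = [cache.access(a) for a in (0, 4, 8, 12)]
second_pass = [cache.access(a) for a in (0, 4, 8, 12)]
```

In this sketch the first pass misses on every block and the second pass hits on every block, because set 0 borrows data lines that the idle sets are not using. A conventional 2-way set-associative cache with LRU and the same eight data lines would miss on every access of this cyclic pattern.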