As processor speeds increase and memory latency becomes more critical, intelligent design and management of secondary caches become increasingly important. Current set-associative caches lose efficiency because programs distribute their memory accesses non-uniformly across cache sets. We propose a technique that varies the associativity of a cache on a per-set basis in response to the demands of the program. By increasing the number of tag-store entries relative to the number of data lines, we achieve the performance benefit of global replacement while maintaining the constant hit latency of a set-associative cache. The proposed variable-way, or V-Way, set-associative cache achieves an average miss rate reduction of 13% on sixteen benchmarks from the SPEC CPU2000 suite. This translates into an average IPC improvement of 8%.
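The idea of decoupling tag-store entries from data lines can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation: the class name `VWayCacheSketch`, the helper `miss_count`, the modulo set indexing, and the use of LRU for both the per-set and the global victim choice are all assumptions made for illustration. The abstract states only that the tag store has more entries than there are data lines and that replacement is global.

```python
from collections import OrderedDict


class VWayCacheSketch:
    """Toy model (hypothetical): more tag entries than data lines,
    with data-line victims chosen globally rather than per set."""

    def __init__(self, num_sets, tag_ways, num_data_lines):
        self.num_sets = num_sets
        self.tag_ways = tag_ways
        # Per set: tag -> data line index. Insertion order tracks
        # per-set recency (oldest first).
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.free_lines = list(range(num_data_lines))
        # Global recency over data lines: line -> (set index, tag).
        # The value plays the role of a pointer back to the owning tag.
        self.global_lru = OrderedDict()
        self.hits = 0
        self.misses = 0

    def access(self, block_addr):
        """Access one block address; return True on a hit."""
        set_idx = block_addr % self.num_sets   # assumed indexing scheme
        tag = block_addr // self.num_sets
        tags = self.sets[set_idx]

        if tag in tags:
            self.hits += 1
            tags.move_to_end(tag)
            self.global_lru.move_to_end(tags[tag])
            return True

        self.misses += 1
        if len(tags) >= self.tag_ways:
            # No free tag entry in this set: evict the set's oldest tag
            # and reuse the data line it pointed to.
            _, line = tags.popitem(last=False)
            del self.global_lru[line]
        elif self.free_lines:
            # Free tag entry and an unused data line.
            line = self.free_lines.pop()
        else:
            # Free tag entry but the data store is full: pick the victim
            # data line globally (LRU is an assumption of this sketch)
            # and invalidate the tag entry that owned it.
            line, (victim_set, victim_tag) = self.global_lru.popitem(last=False)
            del self.sets[victim_set][victim_tag]

        tags[tag] = line
        self.global_lru[line] = (set_idx, tag)
        return False


def miss_count(cache, trace):
    """Run a trace of block addresses and return the number of misses."""
    for block in trace:
        cache.access(block)
    return cache.misses
```

In this sketch, setting `tag_ways * num_sets` equal to `num_data_lines` never triggers the global path, so the model behaves like a conventional LRU set-associative cache and can serve as a baseline. With 4 sets and 8 data lines, a trace that cycles through blocks 0, 4, 8, and 12 (all mapping to set 0) misses on every access when `tag_ways=2`, whereas with `tag_ways=4` the hot set borrows data lines that idle sets are not using and misses only on the first pass. This mirrors the non-uniform set demand described above, though the numbers say nothing about the results reported in the paper.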