Using Prime Numbers for Cache Indexing to Eliminate Conflict Misses

Authors:
Mazen Kharbutli;Keith Irwin;Yan Solihin;Jaejin Lee
Affiliations:
North Carolina State University;North Carolina State University;North Carolina State University;Seoul National University
Venue:
HPCA '04 Proceedings of the 10th International Symposium on High Performance Computer Architecture
Year:
2004

Citing 0
Cited 26

Locality-Aware Process Scheduling for Embedded MPSoCs
Proceedings of the conference on Design, Automation and Test in Europe - Volume 2

Eliminating Conflict Misses Using Prime Number-Based Cache Indexing
IEEE Transactions on Computers

Predicting Cache Space Contention in Utility Computing Servers
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 10 - Volume 11

Skewed caches from a low-power perspective
Proceedings of the 2nd conference on Computing frontiers

A case for a working-set-based memory hierarchy
Proceedings of the 2nd conference on Computing frontiers

The V-Way Cache: Demand Based Associativity via Global Replacement
Proceedings of the 32nd annual international symposium on Computer Architecture

XOR-Based Hash Functions
IEEE Transactions on Computers

Balanced Cache: Reducing Conflict Misses of Direct-Mapped Caches
Proceedings of the 33rd annual international symposium on Computer Architecture

An analytical model for cache replacement policy performance
SIGMETRICS '06/Performance '06 Proceedings of the joint international conference on Measurement and modeling of computer systems

Two-level mapping based cache index selection for packet forwarding engines
Proceedings of the 15th international conference on Parallel architectures and compilation techniques

Using Indexing Functions to Reduce Conflict Aliasing in Branch Prediction Tables
IEEE Transactions on Computers

Heterogeneous way-size cache
Proceedings of the 20th annual international conference on Supercomputing

Reducing cache misses through programmable decoders
ACM Transactions on Architecture and Code Optimization (TACO)

YAARC: yet another approach to further reducing the rate of conflict misses
The Journal of Supercomputing

Design of new XOR-based hash functions for cache memories
Computers & Mathematics with Applications

Counting Dependence Predictors
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture

Notary: Hardware techniques to enhance signatures
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture

Adaptive line placement with the set balancing cache
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture

Entropy representation of memory access characteristics and cache performance
ACST '08 Proceedings of the Fourth IASTED International Conference on Advances in Computer Science and Technology

A new TCB cache to efficiently manage TCP sessions for web servers
Proceedings of the 6th ACM/IEEE Symposium on Architectures for Networking and Communications Systems

Efficient address mapping of shared cache for on-chip many-core architecture
EuroPar'10 Proceedings of the 16th international Euro-Par conference on Parallel processing: Part I

STEM: Spatiotemporal Management of Capacity for Intra-core Last Level Caches
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture

The ZCache: Decoupling Ways and Associativity
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture

Dynamic co-allocation of level one caches
ICESS'05 Proceedings of the Second international conference on Embedded Software and Systems

A comparative analysis of performance improvement schemes for cache memories
Computers and Electrical Engineering

ASCIB: adaptive selection of cache indexing bits for removing conflict misses
Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design

Abstract

Using alternative cache indexing/hashing functions is a popular technique to reduce conflict misses by achieving a more uniform distribution of accesses across the sets in the cache. Although various alternative hashing functions have been demonstrated to eliminate the worst-case conflict behavior, no study has analyzed the pathological behavior of such hashing functions, which often results in performance slowdown. In this paper, we present an in-depth analysis of the pathological behavior of cache hashing functions. Based on the analysis, we propose two new hashing functions, prime modulo and prime displacement, that are resistant to pathological behavior yet still eliminate the worst-case conflict behavior in the L2 cache. We show that these two schemes can be implemented in fast hardware using a set of narrow add operations, with negligible fragmentation in the L2 cache. We evaluate the schemes on 23 memory-intensive applications. For applications that have non-uniform cache accesses, both prime modulo and prime displacement hashing achieve an average speedup of 1.27 compared to traditional hashing, without slowing down any of the 23 benchmarks. We also evaluate using multiple prime displacement hashing functions in conjunction with a skewed associative L2 cache. The skewed associative cache achieves a better average speedup at the cost of some pathological behavior that slows down four applications by up to 7%.
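
As a rough illustration of the two proposed functions, the following Python sketch compares traditional indexing with prime modulo and prime displacement indexing on a strided access pattern. The abstract does not give the formulas, so the forms used here are assumptions: prime modulo is taken as the block address modulo the largest prime not exceeding the number of sets, and prime displacement as (tag × p + index) modulo the number of sets. The 2048-set cache, the displacement constant 9, and all function names are hypothetical choices made for the example.

```python
def largest_prime_at_most(n):
    """Return the largest prime <= n (simple trial division)."""
    def is_prime(k):
        if k < 2:
            return False
        i = 2
        while i * i <= k:
            if k % i == 0:
                return False
            i += 1
        return True

    while n >= 2 and not is_prime(n):
        n -= 1
    return n


N_SETS = 2048                               # hypothetical L2 set count (power of two)
PRIME_SETS = largest_prime_at_most(N_SETS)  # sets actually used by prime modulo
DISPLACEMENT = 9                            # hypothetical displacement constant


def traditional_index(block_addr):
    # Low-order bits of the block address select the set.
    return block_addr % N_SETS


def prime_modulo_index(block_addr):
    # Assumed form: block address modulo a prime number of sets.
    # Sets PRIME_SETS..N_SETS-1 are never used (the "fragmentation").
    return block_addr % PRIME_SETS


def prime_displacement_index(block_addr):
    # Assumed form: displace the traditional index by tag * constant.
    tag, index = divmod(block_addr, N_SETS)
    return (tag * DISPLACEMENT + index) % N_SETS


def sets_touched(index_fn, stride, count=256):
    """Number of distinct sets hit by `count` accesses spaced `stride` blocks apart."""
    return len({index_fn(i * stride) for i in range(count)})


if __name__ == "__main__":
    stride = N_SETS  # worst case for traditional indexing
    for fn in (traditional_index, prime_modulo_index, prime_displacement_index):
        print(fn.__name__, sets_touched(fn, stride))
    print("unused sets with prime modulo:", N_SETS - PRIME_SETS)
```

With a stride equal to the number of sets, traditional indexing sends every access to the same set, while both alternative functions spread the accesses over distinct sets. Under these assumptions, prime modulo leaves only a handful of the 2048 sets unused, which is consistent with the negligible fragmentation the abstract reports.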