Randomized Cache Placement for Eliminating Conflicts

Authors:
Nigel Topham; Antonio González
Affiliations:
Edinburgh Univ., Edinburgh, Scotland; Univ. of Catalonia, Barcelona, Spain
Venue:
IEEE Transactions on Computers - Special issue on cache memory and related problems
Year:
1999

Abstract

Applications with regular patterns of memory access can experience high levels of cache conflict misses. In shared-memory multiprocessors, conflict misses can be increased significantly by the data transpositions required for parallelization. Techniques such as blocking, which are introduced within a single thread to improve locality, can result in yet more conflict misses. The tension between minimizing cache conflicts and the other transformations needed for efficient parallelization leads to complex optimization problems for parallelizing compilers. This paper shows how the introduction of a pseudorandom element into the cache index function can effectively eliminate repetitive conflict misses and produce a cache whose miss ratio depends solely on working set behavior. We examine the impact of pseudorandom cache indexing on processor cycle times and present practical solutions to some of the major implementation issues for this type of cache. Our conclusions are supported by simulations of a superscalar out-of-order processor executing the SPEC95 benchmarks, as well as by cache simulations of individual loop kernels that illustrate specific effects. We present measurements of Instructions committed Per Cycle (IPC) when comparing the performance of different cache architectures on whole-program benchmarks such as the SPEC95 suite.
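
The effect described in the abstract can be illustrated with a small sketch. The abstract does not specify the index function, so the XOR-folding function below is only a familiar stand-in for "a pseudorandom element", not the authors' design. The cache geometry (32-byte lines, 256 sets), the stride, and all function names are assumptions chosen for illustration.

```python
# Hypothetical cache geometry (assumed, not taken from the paper):
# 32-byte lines (5 offset bits) and 256 sets (8 index bits), i.e. 8 KB direct-mapped.
LINE_BITS = 5
INDEX_BITS = 8
INDEX_MASK = (1 << INDEX_BITS) - 1


def conventional_index(addr: int) -> int:
    """Conventional placement: the low-order bits of the line address."""
    return (addr >> LINE_BITS) & INDEX_MASK


def xor_index(addr: int) -> int:
    """Illustrative pseudorandom placement: XOR the conventional index
    with the next-higher address field (a stand-in, not the paper's function)."""
    line = addr >> LINE_BITS
    return (line & INDEX_MASK) ^ ((line >> INDEX_BITS) & INDEX_MASK)


def distinct_sets(index_fn, stride: int, count: int = 64) -> int:
    """Number of different sets touched by `count` accesses `stride` bytes apart."""
    return len({index_fn(i * stride) for i in range(count)})


# A stride equal to the cache size is a classic regular access pattern:
# every access maps to the same set under conventional indexing.
STRIDE = 8192
print(distinct_sets(conventional_index, STRIDE))  # 1
print(distinct_sets(xor_index, STRIDE))           # 64
```

With conventional indexing all 64 accesses compete for a single set, so each one evicts its predecessor. With the XOR-folded index they spread over 64 sets and no longer conflict. A function this simple only shifts the pathological strides to larger values, so it should be read as a demonstration of the idea rather than a recommended index function.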