A comparative analysis of performance improvement schemes for cache memories

  • Authors: Krishna Kavi; Izuchukwu Nwachukwu; Ademola Fawibe

  • Affiliation (all authors): The University of North Texas, Denton, TX 76203, USA

  • Venue: Computers and Electrical Engineering
  • Year: 2012

Abstract

Numerous techniques have been proposed in the literature to improve the performance of cache memories by reducing cache conflicts. These techniques were proposed over the past decade, and each proposal independently claimed to reduce conflict misses. However, because the published results used different benchmarks and different experimental setups, they are not easy to compare. In this paper we report a side-by-side comparison of these techniques. We also evaluate the suitability of some of these techniques for caches with higher set associativities. In addition to evaluating the techniques for their impact on cache misses and average memory access times, we evaluate their ability to reduce the non-uniformity of cache accesses. We conclude that each application may benefit from a different technique and that no single scheme works universally well for all applications. We also observe that, for the majority of applications, the XORing (XOR) and odd-multiplier indexing schemes perform reasonably well. Among programmable associativity techniques, the B-cache performs better than column-associative and adaptive caches, but column-associative caches require only minimal hardware extensions. The uniformity of cache accesses is improved most by the B-cache technique, while the column-associative cache also improves access uniformity. Based on the observation that different techniques benefit different applications, we explored the use of multiple programmable addressing mechanisms, with each addressing scheme designed for a specific application. We include some preliminary data on the use of multiple addressing schemes.
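
The abstract names XOR and odd-multiplier indexing without defining them. The following is a minimal sketch of how such index functions are commonly formulated, not the paper's own implementation. The cache geometry (32-byte blocks, 1024 sets), the multiplier value 21, and all function names are assumptions chosen for illustration.

```python
# Illustrative cache index functions (assumed formulations, not taken from the paper).
BLOCK_BITS = 5                # assumed 32-byte cache blocks
INDEX_BITS = 10               # assumed 1024 sets
NUM_SETS = 1 << INDEX_BITS


def split(addr):
    """Split an address into (tag, index) under conventional modulo indexing."""
    block = addr >> BLOCK_BITS
    return block >> INDEX_BITS, block & (NUM_SETS - 1)


def modulo_index(addr):
    """Conventional indexing: the low-order bits of the block address."""
    return split(addr)[1]


def xor_index(addr):
    """XOR indexing: fold the low-order tag bits into the index."""
    tag, index = split(addr)
    return index ^ (tag & (NUM_SETS - 1))


def odd_multiplier_index(addr, multiplier=21):
    """Odd-multiplier indexing: (multiplier * tag + index) mod number of sets.

    The multiplier must be odd; 21 is an arbitrary illustrative choice.
    """
    tag, index = split(addr)
    return (multiplier * tag + index) % NUM_SETS


def sets_touched(index_fn, addrs):
    """Count the distinct sets that a sequence of addresses maps to."""
    return len({index_fn(a) for a in addrs})


# Sixteen addresses spaced exactly one cache-size apart: under modulo
# indexing they all collide in a single set.
stride = NUM_SETS << BLOCK_BITS
addrs = [i * stride for i in range(16)]

for fn in (modulo_index, xor_index, odd_multiplier_index):
    print(fn.__name__, sets_touched(fn, addrs))
```

With this access pattern, modulo indexing places every address in the same set, while the two alternative functions spread the addresses across different sets. This is the kind of conflict reduction the compared schemes target, although the actual benefit depends on the application's access pattern.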