Improving Performance of Large Physically Indexed Caches by Decoupling Memory Addresses from Cache Addresses

  • Authors:
  • Rui Min;Yiming Hu

  • Affiliations:
  • Univ. of Cincinnati, Cincinnati, OH; Univ. of Cincinnati, Cincinnati, OH

  • Venue:
  • IEEE Transactions on Computers
  • Year:
  • 2001


Abstract

Modern CPUs often use large physically indexed caches that are direct-mapped or have low associativities. Such caches do not interact well with virtual memory systems: an improperly placed physical page ends up in the wrong place in the cache, causing excessive conflicts with other cached pages. Page coloring has been proposed to reduce these conflict misses by carefully placing pages in physical memory. While page coloring works well for some applications, many factors limit its performance: it constrains the page placement system and may increase swapping traffic. In this paper, we propose a novel and simple architecture, called color-indexed, physically tagged caches, which can significantly reduce conflict misses. With some simple modifications to the TLB (Translation Look-aside Buffer), the new architecture decouples the addresses of the cache from the addresses of the main memory. Since the cache addresses no longer depend on the physical memory addresses, the system can freely place data in any cache page to minimize conflict misses, without affecting the paging system. Extensive trace-driven simulation results show that our design performs much better than traditional page coloring techniques. The new scheme enables a direct-mapped cache to achieve hit ratios very close to or better than those of a two-way set-associative cache. Moreover, the architecture does not increase cache access latency, which is a drawback of set-associative caches. The hardware overhead is minimal. We show that our scheme can reduce the cache size by 50 percent without sacrificing performance. A two-way set-associative cache that uses this strategy can perform very close to a fully associative cache.
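To make the conflict problem concrete, the following sketch (not from the paper; all cache and page parameters are illustrative assumptions) shows how a physically indexed, direct-mapped cache derives a page's "color" from its physical frame number, and why two pages sharing a color evict each other's lines. The paper's scheme sidesteps this by letting the TLB assign the cache color independently of the physical frame.

```python
# Illustrative parameters (assumed, not taken from the paper).
PAGE_SIZE = 4096          # 4 KiB pages
CACHE_SIZE = 64 * 1024    # 64 KiB direct-mapped cache
LINE_SIZE = 64            # bytes per cache line

# Number of distinct cache pages ("colors") in the cache.
NUM_COLORS = CACHE_SIZE // PAGE_SIZE   # 16 here

def page_color(phys_addr):
    """Cache page (color) that a physical address maps to."""
    return (phys_addr // PAGE_SIZE) % NUM_COLORS

def cache_index(phys_addr):
    """Direct-mapped cache line index for a physical address."""
    return (phys_addr // LINE_SIZE) % (CACHE_SIZE // LINE_SIZE)

# Two physical pages whose frame numbers differ by a multiple of
# NUM_COLORS share a color, so every line of one page maps onto the
# corresponding line of the other -- a source of conflict misses.
a = 0 * PAGE_SIZE
b = NUM_COLORS * PAGE_SIZE
assert page_color(a) == page_color(b)
assert cache_index(a) == cache_index(b)

# Page coloring avoids this by choosing frames with distinct colors:
c = (NUM_COLORS + 1) * PAGE_SIZE
assert page_color(a) != page_color(c)
```

Under the paper's color-indexed design, the color used for `cache_index` would come from a TLB-assigned field rather than from `phys_addr`, so the allocator is free to pick any physical frame while still avoiding cache conflicts.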