Design Considerations of High Performance Data Cache with Prefetching

  • Authors:
  • Chi-Hung Chi, Jun-Li Yuan


  • Venue:
  • Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
  • Year:
  • 1999

Abstract

In this paper, we propose a set of four load-balancing techniques to address the memory latency problem of on-chip caches. The first two mechanisms, sequential unification and aggressive lookahead, mainly reduce the chance of partial hits and the aborting of accurate prefetch requests. The latter two mechanisms, default prefetching and cache partitioning, optimize cache performance for unpredictable references. The resulting cache, called the LBD (Load-Balancing Data) cache, is found to have superior performance over a wide range of applications. Simulation of the LBD cache with RPT prefetching (Reference Prediction Table - one of the most cited selective data prefetch schemes [2,3]) on SPEC95 showed a significant reduction in data reference latency, ranging from about 20% to over 90%, with an average of 55.89%. This compares with average latency reductions of only 17.37% and 26.05% for prefetch-on-miss and RPT, respectively.
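The RPT baseline cited in the abstract is a stride-based selective prefetcher. The sketch below is a simplified, hypothetical model of such a table (not the paper's implementation): each entry is indexed by the load instruction's PC, tracks the last address and observed stride, and issues a prefetch only once the stride has been confirmed (the "steady" state). Real RPT designs use a finer four-state machine and hardware-sized tables; the names here are illustrative.

```python
class RPTEntry:
    """Per-load-instruction state for a simplified Reference Prediction Table."""
    def __init__(self, addr):
        self.last_addr = addr
        self.stride = 0
        self.state = "initial"   # simplified: initial -> transient -> steady


class RPT:
    def __init__(self):
        self.table = {}          # keyed by the load instruction's PC
        self.prefetches = []     # addresses a prefetch would be issued for

    def access(self, pc, addr):
        entry = self.table.get(pc)
        if entry is None:
            # First time this load is seen: allocate an entry, no prediction yet.
            self.table[pc] = RPTEntry(addr)
            return
        new_stride = addr - entry.last_addr
        if new_stride == entry.stride:
            # Stride confirmed: promote to steady and prefetch one stride ahead.
            entry.state = "steady"
        else:
            # Misprediction: demote and learn the new stride.
            entry.state = "initial" if entry.state == "steady" else "transient"
            entry.stride = new_stride
        entry.last_addr = addr
        if entry.state == "steady" and entry.stride != 0:
            self.prefetches.append(addr + entry.stride)


# Usage: a constant-stride load (e.g. walking an array of 8-byte elements)
# trains the entry, after which each access prefetches the next element.
rpt = RPT()
for addr in (0, 8, 16, 24):
    rpt.access(pc=0x400, addr=addr)
print(rpt.prefetches)   # [24, 32]
```

Prefetching only in the steady state is what makes the scheme selective: irregular reference streams never train an entry, so they generate no useless prefetch traffic, which is exactly the class of references the LBD cache's default prefetching and cache partitioning mechanisms target instead.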