Prefetching for improved bus wrapper performance in cores

  • Authors:
  • Roman Lysecky;Frank Vahid

  • Affiliations:
  • University of California, Riverside, CA;University of California, Riverside, and University of California, Irvine, CA

  • Venue:
  • ACM Transactions on Design Automation of Electronic Systems (TODAES)
  • Year:
  • 2002

Quantified Score

Hi-index 0.00

Visualization

Abstract

Reuse of cores can reduce design time for systems-on-a-chip. Such reuse is dependent on being able to easily interface a core to any bus. To enable such interfacing, many propose separating a core's interface from its internals by using a bus wrapper. However, this separation can lead to a performance penalty when reading a core's internal registers. In this paper, we introduce prefetching, which is analogous to caching, as a technique to reduce or eliminate this performance penalty, involving a tradeoff with power and size. We describe the prefetching technique, classify different types of registers, describe our initial prefetching architectures and heuristics for certain classes of registers, and highlight experiments demonstrating the performance improvements and size/power tradeoffs. We further introduce a technique for automatically designing a prefetch unit that satisfies user-imposed register-access constraints. The technique benefits from mapping the prefetching problem to the well-known real-time process scheduling problem. We then extend the technique to allow user-specified register interdependencies, using a Petri net model, resulting in even more efficient prefetch schedules.