Automatic and efficient evaluation of memory hierarchies for embedded systems

  • Authors:
  • Santosh G. Abraham; Scott A. Mahlke

  • Affiliations:
  • Hewlett-Packard Laboratories, Palo Alto, CA; Hewlett-Packard Laboratories, Palo Alto, CA

  • Venue:
  • Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
  • Year:
  • 1999


Abstract

Automation is the key to the design of future embedded systems, as it permits application-specific customization while keeping design costs low. A key problem faced by automatic design systems is evaluating the performance of the vast number of alternative designs in a timely manner. In this paper, we focus on an embedded system consisting of the following components: a VLIW processor, an instruction cache, a data cache, and a second-level unified cache. A hierarchical approach is used that partitions the system into its constituent components and evaluates each component individually. The performance of each processor is evaluated independently of its memory hierarchy, and each of the caches is simulated using the address traces of a single reference processor. Since changes in the processor architecture do affect the address traces, and thus the performance of the memory hierarchy, the resulting estimate of overall performance is inaccurate. To overcome this error, changes in the processor architecture are modeled as a dilation of the reference processor's address trace, in which each instruction block in the trace is conceptually stretched by a dilation coefficient. This approach yields a projected cache performance that more accurately accounts for changes in the processor architecture. To understand the accuracy of the dilation model, we separate the possible errors that the model introduces and quantify these errors on a set of benchmarks. The results show that the dilation model is effective over most of the design space and facilitates efficient automatic design.
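
The dilation idea can be illustrated with a small sketch. The code below is not taken from the paper; it is a minimal, hypothetical example that assumes instruction addresses are grouped into fixed-size blocks and that each block's placement in the address space is scaled by a dilation coefficient, so that a cache simulated on the dilated trace sees a code footprint roughly representative of a target processor whose code size differs from the reference processor's.

```python
def dilate_itrace(ref_trace, dilation, block_size=64):
    """Scale instruction addresses from a reference-processor trace.

    Hypothetical sketch: each fixed-size instruction block keeps its
    internal offsets but has its block placement multiplied by the
    dilation coefficient, stretching (or shrinking) the code footprint
    seen by a simulated instruction cache.
    """
    dilated = []
    for addr in ref_trace:
        block = addr // block_size          # block containing this fetch
        offset = addr % block_size          # position within the block
        new_block = int(block * dilation)   # stretch the block's placement
        dilated.append(new_block * block_size + offset)
    return dilated

# Example: a 1.5x dilation approximates a target whose code is ~50% larger.
ref_trace = [0x1000, 0x1004, 0x1040, 0x2000]
print(dilate_itrace(ref_trace, dilation=1.5))
```

Under these assumptions, the dilated trace can be fed to an ordinary trace-driven cache simulator in place of re-generating traces for every candidate processor, which is the efficiency the paper's hierarchical evaluation aims for; the paper's actual dilation formulation may differ in detail.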