Embedded cache architecture with programmable write buffer support for power and performance flexibility

  • Authors:
  • Afzal Malik;Bill Moyer;Roger Zhou

  • Affiliations:
  • Motorola Inc., Austin, TX;Motorola Inc., Austin, TX;Motorola Inc., Austin, TX

  • Venue:
  • CASES '02 Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems
  • Year:
  • 2002

Abstract

Next-generation portable devices are placing stringent requirements on overall system power and performance. Voice recognition, streaming video, and high-speed wireless internet access are just some of the features being incorporated in these handheld electronic gadgets. The M·CORE M341-S processor has been designed for high-performance and cost-sensitive portable products as well as for high-end embedded control applications. The M341-S obtains increased performance over the M·CORE M2 and M310 families by integrating a unified 16KB cache, along with additional instruction pipelining and buffering to increase the operating frequency. An 8-entry programmable write buffer, which can defer pending write misses and write-through accesses, is used to maximize performance. In this paper, we discuss the enhanced cache architecture and the flexible priority scheme for controlling the write buffer. We use a hardware technique that provides a flexible mechanism to control emptying and flushing of the write buffer based on a set of configurable thresholds, as well as a mechanism to alter the priorities of accesses from the write buffer to the main memory system. The same unified mechanism supports flushing and provides a solution for read-after-write (RAW) hazard avoidance. We present the enhancements made to the M3 core and discuss the effect on power and performance through benchmark analysis and actual silicon measurements.