Memory performance prediction for high-performance microprocessors at deep submicrometer technologies

  • Authors:
  • A. Zeng;K. Rose;R. J. Gutmann

  • Affiliations:
  • Dept. of Electr. Eng., Rensselaer Polytech. Inst., Troy, NY;-;-

  • Venue:
  • IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
  • Year:
  • 2006

Quantified Score

Hi-index 0.03

Visualization

Abstract

The desire for large size, high-speed, and low-power on-chip memory necessitates early and accurate estimates of memory performance. A new performance model as well as an early cache design tool and predictor of access and cycle time for cache stack (PRACTICS) has been developed for on-chip static random access memory (SRAM) cache design that includes both delay and dynamic-power models. Efficient models for distributed interconnect delays, verified by Cadence simulations, are introduced, and their necessity is demonstrated. In the delay model, the access time is estimated by decomposing each component into several equivalent lumped resistance-capacitance (RC) circuits and using an appropriate order pi model to approximate the distributed wire delays of each stage. The dynamic-power model calculates the charging power dissipation of the load capacitances using the same equivalent lumped RC circuits. The delay model has been validated with an Intel 18-Mb SRAM at the 180-nm node, achieving accuracy to within 10% of the measured results. The dynamic-power model has been validated with an International Business Machines Corporation (IBM) 18-Mb SRAM at the 180-nm node, to within 13% of the measured power consumption. Detailed comparisons between PRACTICS and cache access and cycle time model (CACTI) in both validation cases indicate that an improved wire delay, appropriate circuit structures, and technology dependent parameters are necessary to accurately predict large cache memory performance at deep submicrometer technology nodes. PRACTICS is used to analyze the access time and power consumption in terms of cache sizes and various degrees of associativity for architectural studies. In addition, the PRACTICS simulation results show that repeater insertion reduces the access time significantly, with a small overhead in dynamic-power consumption for large size cache design at deep submicrometer technology