Reducing data cache energy consumption via cached load/store queue

  • Authors:
  • Dan Nicolaescu;Alex Veidenbaum;Alex Nicolau

  • Affiliations:
  • University of California, Irvine, CA;University of California, Irvine, CA;University of California, Irvine, CA

  • Venue:
  • Proceedings of the 2003 international symposium on Low power electronics and design
  • Year:
  • 2003

Abstract

High-performance processors use a large set-associative L1 data cache with multiple ports. As clock speeds and cache size increase, such a cache consumes a significant percentage of the total processor energy. This paper proposes a method of saving energy by reducing the number of data cache accesses. It does so by modifying the Load/Store Queue (LSQ) design to allow "caching" of previously accessed data values on both loads and stores after the corresponding memory access instruction has been committed. It is shown that a 32-entry modified LSQ design allows an average of 38.5% of the loads in the SpecINT95 benchmarks and 18.9% in the SpecFP95 benchmarks to get their data from the LSQ. The reduction in the number of L1 cache accesses results in up to a 40% reduction in L1 data cache energy consumption and up to a 16% improvement in the energy-delay product, while requiring almost no additional hardware or complex control logic.
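
The idea in the abstract can be illustrated with a toy software model. This is only a rough sketch under stated assumptions, not the hardware design from the paper: the class name `CachedLSQ`, its methods, the FIFO replacement of old entries, and the whole-word address matching are all hypothetical choices made for illustration. The model keeps address/value pairs of already committed loads and stores in a small queue and counts how many loads can be served from it without an L1 data cache access.

```python
from collections import OrderedDict


class CachedLSQ:
    """Toy model (hypothetical): a fixed-size queue that retains the
    address/value pairs of committed loads and stores."""

    def __init__(self, entries=32):
        self.entries = entries        # 32 matches the queue size in the abstract
        self.queue = OrderedDict()    # address -> value, oldest entry first
        self.lsq_hits = 0             # loads served from the queue
        self.cache_accesses = 0       # accesses that reach the L1 data cache

    def _insert(self, addr, value):
        # Assumed policy: one entry per address, oldest entry evicted first.
        if addr in self.queue:
            del self.queue[addr]
        elif len(self.queue) >= self.entries:
            self.queue.popitem(last=False)
        self.queue[addr] = value

    def load(self, addr, memory):
        # A matching retained entry supplies the data; the cache is not accessed.
        if addr in self.queue:
            self.lsq_hits += 1
            return self.queue[addr]
        # Otherwise read the cache (modeled as a dict) and retain the value.
        self.cache_accesses += 1
        value = memory.get(addr, 0)
        self._insert(addr, value)
        return value

    def store(self, addr, value, memory):
        # In this model a store still writes the cache, but its value is
        # retained so that a later load to the same address can reuse it.
        self.cache_accesses += 1
        memory[addr] = value
        self._insert(addr, value)


# Example: the second load of the same address is served by the queue.
memory = {0x10: 7}
lsq = CachedLSQ(entries=32)
first = lsq.load(0x10, memory)
second = lsq.load(0x10, memory)
```

In this model, the fraction `lsq_hits / (lsq_hits + load misses)` plays the role of the percentage of loads that get their data from the LSQ; the sketch says nothing about energy, timing, or memory ordering, which a real design would have to handle.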