Performance boosting under reliability and power constraints

  • Authors: Youngtaek Kim; Lizy Kurian John; Indrani Paul; Srilatha Manne; Michael Schulte

  • Affiliations: The University of Texas at Austin, Austin, TX; The University of Texas at Austin, Austin, TX; AMD Research, Advanced Micro Devices, Inc., Austin, TX; AMD Research, Advanced Micro Devices, Inc., Austin, TX; AMD Research, Advanced Micro Devices, Inc., Austin, TX

  • Venue: Proceedings of the International Conference on Computer-Aided Design

  • Year: 2013

Abstract

Voltage droops resulting from inductive noise are common in state-of-the-art processors. Many of the techniques used to reduce energy consumption -- clock gating, power gating, process shrinks, and voltage reduction -- lead to increased voltage droops or increased sensitivity to voltage variations. Designers use voltage guardbands to minimize errors due to voltage fluctuations and inductive noise; however, this leads to lower performance because the voltage and frequency points are set to deal with voltage droops from a worst-case benchmark or stressmark. Although most applications do not approach the voltage droop caused by the stressmark, there is no mechanism to guarantee correct operation outside the tested range. In this paper, we examine floating-point issue throttling (FP throttling), a hardware technique that reduces worst-case voltage droop. By lowering the issue rate in the FP scheduler, the processor can significantly reduce the maximum voltage droop in the system. We show the impact of FP throttling on voltage droop, and translate this reduction in voltage droop to an increase in operating frequency (and hence increased performance) because an additional guardband is no longer required to guard against droops resulting from heavy FP usage. We then examine the impact of FP throttling and guardband reduction on the SPEC CPU2006 benchmarks and show that some benchmarks benefit from the frequency improvements with FP throttling while others suffer due to reduced FP throughput. Finally, we present two techniques to determine dynamically when to trade FP throughput for reduced voltage margin and increased frequency, and show performance improvements of up to 15% for CINT2006 benchmarks and up to 8% for CFP2006 benchmarks. Our studies are done on hardware in which FP units generate the worst-case voltage droop. The technique can be modified for architectures in which other units cause the worst droop.
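
The trade-off described in the abstract, giving up some FP throughput in exchange for a higher clock, can be illustrated with a first-order model in which performance is proportional to frequency times instructions per cycle (IPC). The sketch below is only an illustration under that assumption: the function names, the frequencies, and the IPC values are hypothetical and are not taken from the paper, and the decision rule is not one of the two dynamic techniques the authors present (the abstract does not describe them).

```python
def relative_performance(freq_ghz: float, ipc: float) -> float:
    """First-order performance proxy: instructions per nanosecond."""
    return freq_ghz * ipc


def throttling_speedup(base_freq_ghz: float, boosted_freq_ghz: float,
                       ipc_unthrottled: float, ipc_throttled: float) -> float:
    """Speedup of (FP throttling + higher clock) over the guardbanded baseline.

    A value above 1.0 means the frequency gain outweighs the lost FP throughput.
    """
    baseline = relative_performance(base_freq_ghz, ipc_unthrottled)
    throttled = relative_performance(boosted_freq_ghz, ipc_throttled)
    return throttled / baseline


def should_throttle(base_freq_ghz: float, boosted_freq_ghz: float,
                    ipc_unthrottled: float, ipc_throttled: float) -> bool:
    """Hypothetical decision rule: throttle only when it is a net win."""
    return throttling_speedup(base_freq_ghz, boosted_freq_ghz,
                              ipc_unthrottled, ipc_throttled) > 1.0


# Illustrative, made-up operating points (not measurements from the paper).
BASE_GHZ = 3.0      # frequency with the full voltage guardband
BOOSTED_GHZ = 3.3   # frequency assumed reachable once FP issue is throttled

# Integer-dominated phase: throttling FP issue barely changes IPC.
int_phase = should_throttle(BASE_GHZ, BOOSTED_GHZ,
                            ipc_unthrottled=1.2, ipc_throttled=1.2)

# FP-dominated phase: throttling cuts IPC more than the clock gain recovers.
fp_phase = should_throttle(BASE_GHZ, BOOSTED_GHZ,
                           ipc_unthrottled=1.5, ipc_throttled=1.2)
```

Under these assumed numbers the integer-dominated phase favors throttling and the FP-dominated phase does not, which mirrors the abstract's observation that some benchmarks gain from the frequency increase while others lose from reduced FP throughput.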