Closed-loop modeling of power and temperature profiles of FPGAs

  • Authors: Kanupriya Gulati; Sunil P. Khatri; Peng Li
  • Affiliations: Texas A&M University, College Station, TX, USA (all authors)
  • Venue: Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays
  • Year: 2009

Abstract

In recent times, the contribution of leakage power to the total power consumption of a chip has been increasing at an alarming rate. Leakage power is expected to exceed dynamic power in newer process technologies. Since leakage exhibits an exponential increase with temperature, it is possible that the high leakage of an IC causes a temperature increase, which in turn causes an increase in leakage, and so on, until the IC fails due to overheating. At the very least, this may cause the temperature and power consumption of the IC to be poorly estimated by traditional thermal or power modeling techniques. We developed a framework to model this situation in an FPGA context. Our CAD framework accurately models the total power consumption of the design at a given temperature, finds the thermal profile of the IC under this power consumption, and then uses this new thermal information to update the power consumption. This is iterated until the temperature of the IC converges, or until the temperatures on the die exceed a safe value. The iterations are very fast, due to the use of accurate and compact mathematical macromodels for leakage and temperature computation in the inner loop. We have exhaustively verified the fidelity of all our leakage macromodels. They estimate the leakage, at any temperature, to within 3% of the values generated by SPICE, while providing greater than four orders of magnitude speedup over explicit SPICE runs. Our experiments show that this model helps avoid an incorrect estimation of chip temperature and total power consumption, and also helps detect the increase in device temperature beyond a safe value. The average (maximum) error of our temperature estimates has been found to be within 1% (2.5%) compared to a full-chip 3D temperature modeling tool.
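
The closed-loop iteration described in the abstract can be illustrated with a minimal sketch. This is not the authors' framework. It assumes a single lumped temperature for the whole die, a hypothetical exponential leakage macromodel, and a hypothetical linear thermal model (ambient temperature plus thermal resistance times power). All function names, parameters, and numeric values below are made up for illustration only.

```python
import math


def leakage_power(temp_c, p_leak_ref=0.5, t_ref=25.0, k=0.02):
    """Hypothetical leakage macromodel (watts): exponential in temperature.

    p_leak_ref is the assumed leakage at t_ref; k is an assumed
    temperature coefficient. Neither value comes from the paper.
    """
    return p_leak_ref * math.exp(k * (temp_c - t_ref))


def steady_state_temp(total_power, t_ambient=25.0, r_theta=10.0):
    """Hypothetical lumped thermal model (deg C): T = T_amb + R_theta * P."""
    return t_ambient + r_theta * total_power


def closed_loop(p_dynamic, t_ambient=25.0, r_theta=10.0, t_safe=125.0,
                tol=1e-3, max_iters=1000, **leak_kwargs):
    """Iterate power -> temperature -> power until the temperature
    converges or exceeds the safe limit t_safe."""
    temp = t_ambient  # start from the ambient temperature
    power = p_dynamic + leakage_power(temp, **leak_kwargs)
    for i in range(1, max_iters + 1):
        # Total power at the current temperature estimate.
        power = p_dynamic + leakage_power(temp, **leak_kwargs)
        # Thermal estimate under that power.
        new_temp = steady_state_temp(power, t_ambient, r_theta)
        if new_temp > t_safe:
            return {"status": "unsafe", "temp": new_temp,
                    "power": power, "iters": i}
        if abs(new_temp - temp) < tol:
            return {"status": "converged", "temp": new_temp,
                    "power": power, "iters": i}
        temp = new_temp
    return {"status": "no_convergence", "temp": temp,
            "power": power, "iters": max_iters}


if __name__ == "__main__":
    # Mild leakage: the loop settles at a temperature above the
    # open-loop estimate that ignores the leakage feedback.
    print(closed_loop(p_dynamic=1.0))
    # Strong leakage: the loop detects that the safe limit is exceeded.
    print(closed_loop(p_dynamic=1.0, p_leak_ref=2.0, k=0.05))
```

With the mild-leakage values, an open-loop estimate that evaluates leakage only at ambient temperature would under-report both power and temperature, while the iterated estimate settles at a higher fixed point. With the strong-leakage values, the toy model has no fixed point, so the loop stops once the temperature passes `t_safe`, mirroring the thermal-runaway case the abstract describes.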