HybDTM: a coordinated hardware-software approach for dynamic thermal management

  • Authors:
  • Amit Kumar;Li Shang;Li-Shiuan Peh;Niraj K. Jha

  • Affiliations:
  • Princeton University, Princeton, NJ;Queen's University, Kingston, ON, Canada;Princeton University, Princeton, NJ;Princeton University, Princeton, NJ

  • Venue:
  • Proceedings of the 43rd annual Design Automation Conference
  • Year:
  • 2006

Quantified Score

Hi-index 0.01

Visualization

Abstract

With ever-increasing power density and cooling costs in modern high-performance systems, dynamic thermal management (DTM) has emerged as an effective technique for guaranteeing thermal safety at run-time. While past works on DTM have focused on different techniques in isolation, they fail to consider a synergistic mechanism using both hardware and software support and hence lead to a significant execution time overhead.In this paper, we propose HybDTM, a methodology for fine-grained, coordinated thermal management using a hybrid of hardware techniques, such as clock gating, and software techniques, such as thermal-aware process scheduling, synergistically leveraging the advantages of both approaches. We show that while hardware techniques can be used reactively to manage thermal emergencies, proactive use of low-overhead software techniques can rely on application-specific thermal profiles to lower system temperature. Our technique involves a novel regression-based thermal model which provides fast and accurate temperature estimates for run-time thermal characterization of applications running on the system, using hardware performance counters, while considering system-level thermal issues. We evaluate HybDTM on an actual desktop system running a number of SPEC2000 benchmarks, in both uniprocessor and simultaneous multi-threading (SMT) environments, and show that it is able to successfully manage the overall temperature with an average execution time overhead of only 9.9% (16.3% maximum) compared to the case without any DTM, as opposed to 20.4% (29.5% maximum) overhead for purely hardware-based DTM.