Thermal issues are fast becoming major design constraints in high-performance systems. Temperature variations adversely affect system reliability and prompt worst-case design. Recently, researchers have proposed dynamic thermal-management (DTM) techniques that target average-case design and tackle the temperature issue at runtime. Past work on DTM has studied individual techniques in isolation; it has not considered a system-level approach that uses hardware and software support together, and it therefore incurs significant execution-time overhead. In this paper, we propose HybDTM, a system-level framework for fine-grained, coordinated thermal management that combines hardware techniques (like clock gating) with software techniques (like thermal-aware process scheduling) to leverage the advantages of both. We show that while hardware techniques can be used reactively to manage the overall temperature in case of thermal emergencies, proactive use of software techniques can build on top of them to balance the overall thermal profile with minimal overhead, using operating system (OS) support. To evaluate our proposed hybrid-DTM policy, we develop a novel regression-based thermal model that provides fast and accurate temperature estimates for runtime thermal characterization of all applications running on the system. The model uses hardware performance counters available in modern high-performance processors, alongside thermal sensors for training the model at runtime. It is validated against actual temperature measurements from online thermal sensors, with an average estimation error of less than 5%. We also study system-level DTM issues, jointly considering both the processor and memory, and show how a unified DTM approach can benefit from global knowledge of individual system components.
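To make the idea of a regression-based thermal model concrete, the following is a minimal sketch, not the model from the paper. It assumes temperature can be approximated as a linear function of a few performance-counter rates and fits the weights by ordinary least squares against sensor readings. The choice of counters, the linear form, the function names, and all sample values are illustrative assumptions.

```python
# Hypothetical sketch: fit temperature as a linear function of
# performance-counter rates, with thermal-sensor readings as targets.
# The counters, the linear form, and the data are assumptions.

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit(samples, temps):
    """Least-squares fit of temps ~ w0 + w . counters (normal equations)."""
    rows = [[1.0] + list(s) for s in samples]  # prepend intercept term
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, temps)) for i in range(k)]
    return solve(xtx, xty)

def predict(weights, counters):
    """Estimate temperature from counter rates using fitted weights."""
    return weights[0] + sum(w * c for w, c in zip(weights[1:], counters))

# Made-up training data: (instructions/cycle, memory accesses/cycle)
# paired with sensor temperatures in degrees Celsius.
samples = [(0.5, 0.1), (1.0, 0.4), (1.5, 0.2), (2.0, 0.5)]
temps = [47.0, 58.0, 59.0, 70.0]
weights = fit(samples, temps)
estimate = predict(weights, (1.2, 0.3))
```

A runtime version would refit or incrementally update the weights as new sensor readings arrive; the batch fit here is only the simplest form of the idea.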
We evaluate our proposed methodology on a desktop system with an Intel Pentium-4 processor and a modified Linux OS, running a number of SPEC2000 benchmarks in both uniprocessor and simultaneous multithreaded environments. Our technique successfully manages the overall temperature with an average execution-time overhead of only 10.4% (20.1% maximum) compared to the case without any DTM, as opposed to 23.9% (46% maximum) overhead for purely hardware-based DTM. Our system, including the thermal-aware OS, built-in runtime thermal-characterization model, and interface to the underlying hardware using the Pentium-4 processor, is ready for release.
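The division of labor in a hybrid policy of this kind, with hardware reacting to emergencies and software acting proactively, can be illustrated with a small sketch. This is not the HybDTM implementation: the thresholds, the `Task` type, the `predicted_heat` score, and both function names are hypothetical, chosen only to show the two paths side by side.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    predicted_heat: float  # hypothetical per-task thermal score

def hardware_throttle(temp_c, emergency_c=70.0):
    """Reactive path: request clock gating only in a thermal emergency.

    The 70 C emergency threshold is an assumed value.
    """
    return temp_c >= emergency_c

def pick_next(run_queue, temp_c, trigger_c=60.0):
    """Proactive path: above the software trigger, prefer the coolest task.

    The 60 C trigger is an assumed value set below the emergency
    threshold so that scheduling acts before hardware throttling does.
    """
    if not run_queue:
        return None
    if temp_c >= trigger_c:
        return min(run_queue, key=lambda t: t.predicted_heat)
    return run_queue[0]  # otherwise keep the default scheduling order

queue = [Task("hot", 9.0), Task("cool", 2.0)]
```

With this queue, a reading below the trigger leaves the default order untouched, a reading between the two thresholds selects the cooler task, and only a reading at or above the emergency threshold engages the hardware mechanism.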