Throughput servers built on simultaneous multithreaded (SMT) processors are becoming an important paradigm, with products such as Sun's Niagara and IBM's Power5. Unfortunately, throughput computing via SMT aggravates power-density problems: SMT increases utilization, which leaves overheated resources fewer opportunities to cool. Existing power-density techniques fall into three groups: slowing computation and lowering supply voltage, which is likely infeasible in future technologies; stopping computation to reduce heating, which substantially degrades performance; and migrating computation to spare resources, which adds complexity or requires underutilized resources that may not be available in an SMT-based throughput server. An alternative is to increase the area of heat-prone resources at design time. We propose the concept of dilation, in which a resource's circuit components are spread over an area larger than required for correct logic. The larger area allows the resource to be utilized more heavily without violating power-density constraints. This paper is the first to consider increasing CPU resource area to improve throughput in a thermally constrained processor. Dilating area to improve performance seems counterintuitive because it also increases the latency of the dilated components. However, this technique is uniquely effective in SMT-based throughput computing: having multiple threads from which to choose instructions makes SMTs more tolerant of added latency than superscalars. We propose two implementations. Our first implementation, Simple Resource Area Dilation (S-RAD), increases the area of heat-prone resources and scales the CPU clock frequency accordingly. Our second implementation, Pipelined Resource Area Dilation (PRAD), pipelines the dilated resources to maintain clock frequency.
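The intuition behind dilation can be illustrated with a toy calculation. The sketch below is not the paper's model; it assumes that a resource's power scales linearly with its utilization, that dilation leaves peak power unchanged (ignoring any extra wire power), and that heat spreads evenly over the resource's area. The function names, parameters, and numbers are hypothetical and chosen only for illustration.

```python
def power_density(power_w: float, area_mm2: float) -> float:
    """Average power density in W/mm^2 (assumes uniform heat spreading)."""
    if area_mm2 <= 0:
        raise ValueError("area must be positive")
    return power_w / area_mm2


def max_utilization(peak_power_w: float, base_area_mm2: float,
                    dilation: float, density_limit: float) -> float:
    """Largest utilization (0.0 to 1.0) that keeps the resource under the limit.

    Toy assumptions: power is proportional to utilization, and dilation
    multiplies area without changing peak power.
    """
    if dilation < 1.0:
        raise ValueError("dilation factor must be at least 1.0")
    dilated_area = base_area_mm2 * dilation
    peak_density = power_density(peak_power_w, dilated_area)
    return min(1.0, density_limit / peak_density)


# Hypothetical resource: 4 W at full utilization, 2 mm^2, limit of 1 W/mm^2.
for factor in (1.0, 1.5, 2.0):
    allowed = max_utilization(4.0, 2.0, factor, 1.0)
    print(f"dilation {factor:.1f}x -> max utilization {allowed:.2f}")
```

Under these assumptions, the undilated resource can be active only half the time, while doubling its area lets it run continuously within the same power-density limit. The sketch leaves out the latency cost of dilation, which is the trade-off the two proposed implementations address.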