Heat-and-run: leveraging SMT and CMP to manage power density through the operating system

Authors:
Mohamed Gomaa;Michael D. Powell;T. N. Vijaykumar
Affiliations:
Purdue University, West Lafayette, IN;Purdue University, West Lafayette, IN;Purdue University, West Lafayette, IN
Venue:
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Year:
2004



Abstract

Power density in high-performance processors continues to increase with technology generations as scaling of current, clock speed, and device density outpaces the downscaling of supply voltage and the ability of packages to dissipate heat. Power density is characterized by localized chip hot spots that can reach critical temperatures and cause failure. Previous architectural approaches to power density have used global clock gating, fetch toggling, dynamic frequency scaling, or resource duplication to either prevent heating or relieve overheated resources in a superscalar processor. Previous approaches also evaluate design technologies where power density is not a major problem and most applications do not overheat the processor. Future processors, however, are likely to be chip multiprocessors (CMPs) with simultaneously multithreaded (SMT) cores. SMT CMPs pose unique challenges and opportunities for power density. SMT and CMP increase throughput and thus on-chip heat, but also provide natural granularities for managing power density. This paper is the first work to leverage SMT and CMP to address power density. We propose heat-and-run SMT thread assignment, which co-schedules threads that use complementary resources to increase processor-resource utilization before cooling becomes necessary. We propose heat-and-run CMP thread migration, which migrates threads away from overheated cores and assigns them to free SMT contexts on alternate cores, maintaining throughput while allowing overheated cores to cool. We show that our proposal has an average of 9% and up to 34% higher throughput than a previous superscalar technique running the same number of threads.
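
The migration idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the abstract gives no policy details, so the `Core` class, the temperature threshold, the two SMT contexts per core, and the "pick the coolest core with a free context" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    """Hypothetical model of one CMP core with a few SMT contexts."""
    core_id: int
    temperature: float              # assumed sensor reading, degrees C
    contexts: int = 2               # assumed number of SMT contexts
    threads: list = field(default_factory=list)

    def free_contexts(self) -> int:
        return self.contexts - len(self.threads)


def heat_and_run_migrate(cores, threshold):
    """Move threads off cores at or above `threshold` onto free SMT
    contexts of cooler cores. Returns a list of (thread, src_id, dst_id).
    Temperatures are treated as fixed inputs; this sketch does not model
    heating or cooling over time."""
    moves = []
    hot_cores = [c for c in cores if c.temperature >= threshold]
    for src in hot_cores:
        for thread in list(src.threads):
            targets = [c for c in cores
                       if c.temperature < threshold and c.free_contexts() > 0]
            if not targets:
                break  # no cool context available; the thread stays put
            dst = min(targets, key=lambda c: c.temperature)
            src.threads.remove(thread)
            dst.threads.append(thread)
            moves.append((thread, src.core_id, dst.core_id))
    return moves


cores = [
    Core(0, 90.0, threads=["a", "b"]),
    Core(1, 60.0, threads=["c"]),
    Core(2, 55.0),
]
moves = heat_and_run_migrate(cores, threshold=85.0)
```

In this example both threads leave the overheated core 0 and land on core 2, the coolest core with free contexts, so the thread count stays the same while core 0 is left idle to cool.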