Coordinated energy management in heterogeneous processors

Authors:
Indrani Paul;Vignesh Ravi;Srilatha Manne;Manish Arora;Sudhakar Yalamanchili
Affiliations:
Advanced Micro Devices, Inc. and Georgia Institute of Technology;Advanced Micro Devices, Inc.;Advanced Micro Devices, Inc.;Advanced Micro Devices, Inc. and University of California, San Diego;Georgia Institute of Technology
Venue:
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Year:
2013

Abstract

This paper examines energy management for high-performance computing (HPC) applications on a heterogeneous processor with an integrated CPU and GPU. Energy management for HPC applications is challenged by their uncompromising performance requirements and complicated by the need to coordinate energy management across distinct core types, a new and less understood problem. We examine the intra-node CPU-GPU frequency sensitivity of HPC applications on tightly coupled CPU-GPU architectures as the first step in understanding power and performance optimization for a heterogeneous multi-node HPC system. The insights from this analysis form the basis of a coordinated energy management scheme, called DynaCo, for integrated CPU-GPU architectures. We implement DynaCo on a modern heterogeneous processor and compare its performance to a state-of-the-art power- and performance-management algorithm. DynaCo improves the measured average energy-delay-squared (ED^2) product by up to 30% with less than 2% average performance loss across several exascale and other HPC workloads.
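
For readers unfamiliar with the metric, the ED^2 product multiplies energy by the square of execution time, so it penalizes slowdowns more heavily than energy alone does. The sketch below shows how a relative ED^2 improvement could be computed from two measurements. The function names and all input numbers are hypothetical illustrations, not values or code from the paper.

```python
def ed2(energy_joules: float, delay_seconds: float) -> float:
    """Energy-delay-squared product, E * D^2 (lower is better)."""
    return energy_joules * delay_seconds ** 2


def ed2_improvement(base_energy: float, base_delay: float,
                    new_energy: float, new_delay: float) -> float:
    """Fractional ED^2 reduction of a new configuration relative to a baseline."""
    base = ed2(base_energy, base_delay)
    return (base - ed2(new_energy, new_delay)) / base


# Hypothetical measurements (not from the paper): the new configuration
# runs 2% slower but consumes 25% less energy than the baseline.
improvement = ed2_improvement(100.0, 10.0, 75.0, 10.2)
print(f"ED^2 improvement: {improvement:.1%}")
```

Because delay enters squared, even a small slowdown offsets part of the energy saving, which is why the metric suits workloads with strict performance requirements.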