Deriving a near-optimal power management policy using model-free reinforcement learning and Bayesian classification

  • Authors:
  • Yanzhi Wang; Qing Xie; Ahmed Ammari; Massoud Pedram

  • Affiliations:
  • University of Southern California, Los Angeles, CA; University of Southern California, Los Angeles, CA; National Institute of Applied Science and Technology (INSAT), Tunis Cedex, Tunisia; University of Southern California, Los Angeles, CA

  • Venue:
  • Proceedings of the 48th Design Automation Conference
  • Year:
  • 2011

Abstract

To cope with the variations and uncertainties that emanate from hardware and application characteristics, dynamic power management (DPM) frameworks must be able to learn about the system inputs and environment and adjust the power management policy on the fly. In this paper, we present an online adaptive DPM technique based on model-free reinforcement learning (RL), which is commonly used to control stochastic dynamical systems. In particular, we employ temporal difference learning for semi-Markov decision processes (SMDPs) as the model-free RL method. In addition, a novel workload predictor based on an online Bayes classifier is presented to provide effective estimates of the workload states for the RL algorithm. In this DPM framework, the power-latency tradeoff can be precisely controlled by a user-defined parameter. Experiments show that the average power saving (without any increase in latency) is up to 16.7% compared to a reference expert-based approach, while the per-request latency reduction (without any increase in power consumption) is up to 28.6% compared to the same expert-based approach.
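The abstract combines two ingredients: temporal difference learning for an SMDP and an online Bayes classifier that predicts the workload state consumed by the power manager. The following is a minimal, hypothetical Python sketch of how such pieces could fit together; it is not the authors' implementation. The class names, the feature discretization, the TD(0)-style update (the paper uses TD(λ) learning for the SMDP), and all parameters (alpha, beta, epsilon, and the power/latency weight w) are illustrative assumptions.

```python
import math
import random
from collections import defaultdict


class OnlineNaiveBayes:
    """Incrementally trained naive Bayes classifier over discretized
    workload features (e.g. binned inter-arrival times of recent requests).
    Predicts the upcoming workload state fed to the RL power manager."""

    def __init__(self, classes, feature_bins, laplace=1.0):
        self.classes = list(classes)           # e.g. ["short_idle", "long_idle"]
        self.feature_bins = feature_bins       # number of discrete values per feature
        self.laplace = laplace                 # additive smoothing constant
        self.class_counts = defaultdict(float)
        self.feat_counts = defaultdict(float)  # (class, feature index, value) -> count
        self.total = 0.0

    def update(self, features, label):
        """Fold in one observed (feature vector, true workload state) pair."""
        self.class_counts[label] += 1.0
        self.total += 1.0
        for i, v in enumerate(features):
            self.feat_counts[(label, i, v)] += 1.0

    def predict(self, features):
        """Return the maximum a posteriori workload state."""
        best, best_lp = None, -math.inf
        for c in self.classes:
            lp = math.log((self.class_counts[c] + self.laplace) /
                          (self.total + self.laplace * len(self.classes)))
            for i, v in enumerate(features):
                num = self.feat_counts[(c, i, v)] + self.laplace
                den = self.class_counts[c] + self.laplace * self.feature_bins[i]
                lp += math.log(num / den)
            if lp > best_lp:
                best, best_lp = c, lp
        return best


class SMDPQLearner:
    """Model-free TD-style learner for an SMDP: the random sojourn time tau
    between decision epochs enters through a continuous-time discount factor
    exp(-beta * tau). The per-epoch cost blends power and latency through a
    user-chosen weight w (the tradeoff parameter mentioned in the abstract)."""

    def __init__(self, actions, alpha=0.1, beta=0.05, epsilon=0.1, w=0.5):
        self.actions = list(actions)           # e.g. device power states to command
        self.alpha, self.beta = alpha, beta    # learning rate, discount rate
        self.epsilon, self.w = epsilon, w      # exploration prob., power/latency weight
        self.q = defaultdict(float)            # (state, action) -> expected discounted cost

    def choose(self, state):
        """Epsilon-greedy action selection (costs are minimized)."""
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return min(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, power, latency, tau, next_state):
        """One TD(0) update after a sojourn of length tau in `state`."""
        cost = self.w * power + (1.0 - self.w) * latency
        gamma = math.exp(-self.beta * tau)
        best_next = min(self.q[(next_state, a)] for a in self.actions)
        td_error = cost + gamma * best_next - self.q[(state, action)]
        self.q[(state, action)] += self.alpha * td_error
```

In such a sketch, `state` could be a tuple of the device's current power mode, the queue occupancy, and the workload state returned by `OnlineNaiveBayes.predict`, so that the classifier's output directly shapes the state space seen by the learner; how the states and features are actually defined in the paper is not specified in this abstract.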