R(λ) imitation learning for automatic generation control of interconnected power grids

Authors:
T. Yu;B. Zhou;K. W. Chan;Y. Yuan;B. Yang;Q. H. Wu
Affiliations:
College of Electric Power, South China University of Technology, Guangzhou, 510640, China;Department of Electrical Engineering, The Hong Kong Polytechnic University, Hong Kong Special Administrative Region;Department of Electrical Engineering, The Hong Kong Polytechnic University, Hong Kong Special Administrative Region;Qingdao Power Supply Company, State Grid Corporation of China, Qingdao, 266000, China;College of Electric Power, South China University of Technology, Guangzhou, 510640, China;Department of Electrical Engineering and Electronics, The University of Liverpool, Liverpool, L69 3GJ, UK
Venue:
Automatica (Journal of IFAC)
Year:
2012

Abstract

The goal of average reward reinforcement learning (RL) is to maximize the long-term average reward of a generic system. This coincides with the design objective of the control performance standards (CPS), which were established to improve the long-term performance of an automatic generation controller (AGC) used for real-time control of interconnected power systems. In this paper, a novel R(λ) imitation learning (R(λ)IL) method based on the average reward optimality criterion is presented to develop an optimal AGC under the CPS. This R(λ)IL-based AGC can operate online in real time with high CPS compliance and a fast convergence rate in the imitation pre-learning process. Its capability to learn the control behaviors of the existing AGC by observing system variations enables it to overcome a serious limitation of conventional RL controllers, which require an accurate power system model for the offline pre-learning process, and significantly enhances learning efficiency and control performance for power generation control in various power system operation scenarios.
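
To make the average reward criterion concrete, the following is a minimal sketch of one tabular update in the style of textbook R-learning with eligibility traces. It is not the paper's algorithm: the function name `r_lambda_step`, the step sizes `alpha` and `beta`, the trace decay `lam`, and the choice to clear traces after an exploratory action are all assumptions made for illustration, and the imitation pre-learning stage is not modeled. In an AGC setting the states and actions would presumably be discretized grid measurements and regulation commands, but that mapping is likewise assumed here.

```python
def r_lambda_step(R, e, rho, s, a, r, s_next, alpha=0.1, beta=0.01, lam=0.9):
    """One tabular R-learning update with eligibility traces (illustrative).

    R and e are lists of lists indexed [state][action] and are updated in place.
    rho is the running estimate of the average reward; the new value is returned.
    """
    # Check greediness before the table changes.
    greedy = R[s][a] == max(R[s])
    # Average-reward TD error: the reward is measured relative to rho,
    # and there is no discount factor.
    delta = r - rho + max(R[s_next]) - R[s][a]
    e[s][a] += 1.0  # accumulating trace for the visited pair
    for i in range(len(R)):
        for j in range(len(R[i])):
            R[i][j] += alpha * delta * e[i][j]
            # Assumption: decay traces after a greedy action, clear them otherwise.
            e[i][j] = lam * e[i][j] if greedy else 0.0
    if greedy:
        # The average-reward estimate is adjusted only on greedy steps.
        rho += beta * delta
    return rho


# Usage on a hypothetical two-state, two-action table.
R = [[0.0, 0.0], [0.0, 0.0]]
e = [[0.0, 0.0], [0.0, 0.0]]
rho = r_lambda_step(R, e, 0.0, s=0, a=1, r=1.0, s_next=1)
```

Because the reward enters the update relative to the running average `rho`, the learned values rank actions by their effect on long-run average reward rather than on a discounted sum, which is the property the abstract relates to the long-term CPS objective.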