A hybrid evolving and gradient strategy for approximating policy evaluation on online critic-actor learning

  • Authors:
  • Jian Fu; Haibo He; Huiying Li; Qing Liu

  • Affiliations:
  • School of Automation, Wuhan University of Technology, Wuhan, Hubei, China; Department of Electrical, Computer and Biomedical Engineering, University of Rhode Island, Kingston, RI; School of Automation, Wuhan University of Technology, Wuhan, Hubei, China; School of Automation, Wuhan University of Technology, Wuhan, Hubei, China

  • Venue:
  • ISNN'12 Proceedings of the 9th international conference on Advances in Neural Networks - Volume Part I
  • Year:
  • 2012

Abstract

In this paper, we propose a novel strategy for approximating policy evaluation during the online critic-actor learning procedure. In the early stage, we adopt adaptive differential evolution with elites (ADEE), which is good at global search, to optimize the one-step moving least-squares temporal difference (MLSTD(0)). We then apply a gradient method to perform the local search efficiently and effectively. This resolves the dilemma between exploration and exploitation in the weight search for the critic neural network. Simulation results on the online learning control of a cart-pole benchmark demonstrate the efficiency of the proposed method.
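
The two-stage idea described in the abstract (a global evolutionary search followed by gradient-based refinement of the critic weights) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' method: it uses a plain DE/rand/1/bin search with greedy selection instead of ADEE, a linear critic with a mean squared one-step TD error instead of the MLSTD(0) objective and a neural network critic, and every function name and hyperparameter below is hypothetical.

```python
import random


def td_loss(w, batch, gamma):
    """Mean squared one-step TD error of a linear critic V(s) = w . phi(s).

    Assumption: each batch item is (phi, reward, phi_next) with feature lists.
    """
    total = 0.0
    for phi, r, phi_next in batch:
        v = sum(wi * x for wi, x in zip(w, phi))
        v_next = sum(wi * x for wi, x in zip(w, phi_next))
        total += (r + gamma * v_next - v) ** 2
    return total / len(batch)


def td_loss_grad(w, batch, gamma):
    """Exact gradient of td_loss with respect to w (residual-gradient form)."""
    grad = [0.0] * len(w)
    for phi, r, phi_next in batch:
        v = sum(wi * x for wi, x in zip(w, phi))
        v_next = sum(wi * x for wi, x in zip(w, phi_next))
        delta = r + gamma * v_next - v
        for i in range(len(w)):
            grad[i] += 2.0 * delta * (gamma * phi_next[i] - phi[i])
    return [g / len(batch) for g in grad]


def de_search(loss, dim, rng, pop_size=20, generations=40, f=0.6, cr=0.9, bound=5.0):
    """Stage 1 (global): plain DE/rand/1/bin, a stand-in for ADEE."""
    pop = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(pop_size)]
    fit = [loss(p) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = [
                pop[a][k] + f * (pop[b][k] - pop[c][k])
                if (rng.random() < cr or k == j_rand)
                else pop[i][k]
                for k in range(dim)
            ]
            t_fit = loss(trial)
            if t_fit <= fit[i]:  # greedy selection keeps the better vector
                pop[i], fit[i] = trial, t_fit
    best = min(range(pop_size), key=fit.__getitem__)
    return list(pop[best])


def gradient_refine(w, batch, gamma, lr=0.05, steps=200):
    """Stage 2 (local): fixed-step gradient descent from the DE solution."""
    for _ in range(steps):
        g = td_loss_grad(w, batch, gamma)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w


def hybrid_critic_fit(batch, dim, gamma=0.9, seed=0):
    """Run the global stage, then the local stage; return both weight vectors."""
    rng = random.Random(seed)
    w_global = de_search(lambda w: td_loss(w, batch, gamma), dim, rng)
    w_local = gradient_refine(w_global, batch, gamma)
    return w_global, w_local
```

Calling `hybrid_critic_fit(batch, dim)` on a list of `(phi, reward, phi_next)` transitions returns the weights after each stage, so the TD loss before and after the gradient refinement can be compared with `td_loss`. The fixed switch point (a set number of DE generations) is a simplification; the abstract does not state how the paper decides when to hand over from the evolutionary stage to the gradient stage.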