A hierarchical representation policy iteration algorithm for reinforcement learning

  • Authors:
  • Jian Wang; Lei Zuo; Jian Wang; Xin Xu; Chun Li

  • Affiliations:
  • College of Mechatronics and Automation, National University of Defense Tech., Changsha, P.R. China; College of Mechatronics and Automation, National University of Defense Tech., Changsha, P.R. China; Xi'an Air Force Military Representative Office, China; College of Mechatronics and Automation, National University of Defense Tech., Changsha, P.R. China; College of Mechatronics and Automation, National University of Defense Tech., Changsha, P.R. China

  • Venue:
  • IScIDE'12 Proceedings of the third Sino-foreign-interchange conference on Intelligent Science and Intelligent Data Engineering
  • Year:
  • 2012

Abstract

This paper presents a hierarchical representation policy iteration (HRPI) algorithm, which combines the representation policy iteration (RPI) algorithm with a state space decomposition method implemented by a binary tree. In HRPI, the state space is first decomposed into multiple sub-spaces according to an approximate value function. Local policies are then estimated on each sub-space, and finally a global near-optimal policy is obtained by combining these local policies. Simulation results indicate that the proposed method performs better than the conventional RPI algorithm.
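
The decompose, solve locally, then combine structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every name in it is hypothetical. It assumes a 10-state chain, a median split on the approximate value function as the binary-tree rule, and a simple greedy rule (`greedy_local_policy`) standing in for the RPI step that the paper runs on each sub-space.

```python
from statistics import median
from typing import Callable, Dict, List, Optional, Sequence


class Node:
    """One node of the binary decomposition tree (hypothetical structure)."""

    def __init__(self, states: List[int]) -> None:
        self.states = states
        self.threshold: Optional[float] = None  # set only on internal nodes
        self.low: Optional["Node"] = None
        self.high: Optional["Node"] = None


def build_tree(states: Sequence[int], value: Sequence[float], max_leaf_size: int) -> Node:
    """Recursively split the states at the median of the approximate value function.

    The median split is an assumption of this sketch, not a rule from the paper.
    """
    node = Node(list(states))
    if len(node.states) <= max_leaf_size:
        return node
    threshold = median(value[s] for s in node.states)
    low = [s for s in node.states if value[s] <= threshold]
    high = [s for s in node.states if value[s] > threshold]
    if not high:  # all remaining values tie, so no further split is possible
        return node
    node.threshold = threshold
    node.low = build_tree(low, value, max_leaf_size)
    node.high = build_tree(high, value, max_leaf_size)
    return node


def leaves(node: Node) -> List[List[int]]:
    """Return the sub-spaces (leaf state sets) in left-to-right order."""
    if node.threshold is None:
        return [node.states]
    return leaves(node.low) + leaves(node.high)


def greedy_local_policy(sub_space: List[int], value: Sequence[float]) -> Dict[int, int]:
    """Placeholder for the per-sub-space RPI step: move toward the higher-valued neighbor."""
    last = len(value) - 1
    policy = {}
    for s in sub_space:
        left, right = max(s - 1, 0), min(s + 1, last)
        policy[s] = 1 if value[right] >= value[left] else -1
    return policy


LocalSolver = Callable[[List[int], Sequence[float]], Dict[int, int]]


def combine_local_policies(tree: Node, value: Sequence[float], solver: LocalSolver) -> Dict[int, int]:
    """Build the global policy as the union of the local policies of all leaves."""
    global_policy: Dict[int, int] = {}
    for sub_space in leaves(tree):
        global_policy.update(solver(sub_space, value))
    return global_policy


# Assumed toy problem: a 10-state chain with the goal at state 9 and discount 0.9.
value = [0.9 ** (9 - s) for s in range(10)]
tree = build_tree(range(10), value, max_leaf_size=3)
sub_spaces = leaves(tree)
policy = combine_local_policies(tree, value, greedy_local_policy)
```

In this toy setting the tree yields four sub-spaces and the combined policy moves right in every state. In an HRPI-style method, the placeholder solver would be replaced by RPI run separately on each leaf.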