Adaptive dual heuristic programming based on delta-bar-delta learning rule

Authors:
Jun Wu;Xin Xu;Chuanqiang Lian;Yan Huang
Affiliations:
Institute of Automation, College of Mechatronics and Automation, National University of Changsha, China;Institute of Automation, College of Mechatronics and Automation, National University of Changsha, China;Institute of Automation, College of Mechatronics and Automation, National University of Changsha, China;Institute of Automation, College of Mechatronics and Automation, National University of Changsha, China
Venue:
ISNN'11 Proceedings of the 8th international conference on Advances in neural networks - Volume Part III
Year:
2011

References (6):
A menu of designs for reinforcement learning over time. Neural networks for control.
Dual heuristic programming based nonlinear optimal control for a synchronous generator. Engineering Applications of Artificial Intelligence.
Adaptive critic motion control design of autonomous wheeled mobile robot by dual heuristic programming. Automatica (Journal of IFAC).
Application of collective robotic search using neural network based dual heuristic programming (DHP). ISNN'06 Proceedings of the Third international conference on Advances in Neural Networks - Volume Part II.
Adaptive critic designs. IEEE Transactions on Neural Networks.
Comparison of heuristic dynamic programming and dual heuristic programming adaptive critics for neurocontrol of a turbogenerator. IEEE Transactions on Neural Networks.

Cited by: 0

Abstract

Dual Heuristic Programming (DHP) is a class of approximate dynamic programming methods that use neural networks. Although DHP has been applied successfully, its performance and convergence depend strongly on the step sizes chosen for the critic module and the actor module. This paper proposes a Delta-Bar-Delta learning rule for the DHP algorithm, which lets each of the two modules adapt its learning rates individually. The feasibility and effectiveness of the proposed method are illustrated on the learning control task of an inverted pendulum.
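
The abstract does not spell out the update itself, so the following is only a minimal sketch of the generic delta-bar-delta rule (in the form usually attributed to Jacobs, 1988), not the paper's DHP-specific formulation. Each weight keeps its own step size: the step size grows additively when the current gradient agrees in sign with a running average of past gradients, and shrinks multiplicatively when the signs disagree. The function name `delta_bar_delta_step`, the hyperparameter values `kappa`, `phi`, `theta`, and the toy quadratic objective are all assumptions made for illustration.

```python
def delta_bar_delta_step(w, grad, lr, bar, kappa=0.01, phi=0.1, theta=0.7):
    """One generic delta-bar-delta update over lists of equal length.

    w    : current weights
    grad : gradient of the loss with respect to each weight
    lr   : per-weight step sizes
    bar  : running average of past gradients, one per weight
    kappa, phi, theta are illustrative values, not taken from the paper.
    """
    new_w, new_lr, new_bar = [], [], []
    for w_i, g_i, lr_i, bar_i in zip(w, grad, lr, bar):
        agree = g_i * bar_i
        if agree > 0:
            lr_i += kappa            # consistent direction: grow additively
        elif agree < 0:
            lr_i *= (1.0 - phi)      # sign flip: shrink multiplicatively
        new_w.append(w_i - lr_i * g_i)                         # gradient step
        new_lr.append(lr_i)
        new_bar.append((1.0 - theta) * g_i + theta * bar_i)    # update average
    return new_w, new_lr, new_bar


# Toy usage on a hypothetical quadratic loss 0.5 * sum(a_i * w_i**2).
a = [1.0, 10.0]          # assumed curvatures, chosen to differ by 10x
w = [1.0, 1.0]
lr = [0.01, 0.01]
bar = [0.0, 0.0]
loss_start = 0.5 * sum(a_i * w_i ** 2 for a_i, w_i in zip(a, w))
for _ in range(200):
    grad = [a_i * w_i for a_i, w_i in zip(a, w)]
    w, lr, bar = delta_bar_delta_step(w, grad, lr, bar)
loss_end = 0.5 * sum(a_i * w_i ** 2 for a_i, w_i in zip(a, w))
```

In a DHP setting one would presumably keep a separate `lr` and `bar` state for the critic weights and for the actor weights, which matches the abstract's statement that the two modules adjust their learning rates individually; how the paper computes the gradients fed into the rule is not given here.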