Approximate/adaptive dynamic programming (ADP) has been studied extensively in recent years for its potential scalability to problems with large state and control spaces, including those with continuous states and controls. The applicability of ADP algorithms, especially the adaptive critic designs, has been demonstrated in several case studies. Direct heuristic dynamic programming (direct HDP) is an ADP algorithm inspired by the adaptive critic designs, and it has been shown applicable to realistic, complex, industrial-scale control problems. In this paper, we provide a uniform ultimate boundedness (UUB) result for the direct HDP learning controller under mild and intuitive conditions. Using a Lyapunov approach, we show that the estimation errors of the learning parameters, i.e., the weights of the action and critic networks, remain uniformly ultimately bounded. This result provides, for the first time, a controller convergence guarantee for the direct HDP design.
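To make the setup concrete, the following is a minimal sketch (not the paper's exact formulation) of the direct HDP weight updates: a critic network is trained on the temporal-difference-style error e_c(t) = αJ(t) − [J(t−1) − r(t)], and an action network is trained to drive the critic output toward a desired ultimate objective U_c (taken as 0 here). The toy linear plant, network sizes, learning rates, and the choice to update only the output-layer weights are all assumptions made to keep the illustration short.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_forward(W1, W2, x):
    """One-hidden-layer network with tanh activation; returns output and hidden layer."""
    h = np.tanh(W1 @ x)
    return W2 @ h, h

# Critic maps [state; action] -> J; actor maps state -> u (sizes are assumptions).
n_state, n_hidden = 2, 6
Wc1 = 0.1 * rng.standard_normal((n_hidden, n_state + 1))
Wc2 = 0.1 * rng.standard_normal((1, n_hidden))
Wa1 = 0.1 * rng.standard_normal((n_hidden, n_state))
Wa2 = 0.1 * rng.standard_normal((1, n_hidden))

alpha, lr_c, lr_a = 0.95, 0.05, 0.05      # discount and learning rates (assumed)
A = np.array([[0.9, 0.1], [0.0, 0.8]])    # toy stable linear plant (assumed)
B = np.array([0.0, 0.1])
x, J_prev = np.array([0.5, -0.3]), 0.0

for t in range(200):
    u, ha = mlp_forward(Wa1, Wa2, x)                # action network output
    r = x @ x + 0.1 * u.item() ** 2                 # quadratic stage cost (assumed)
    J, hc = mlp_forward(Wc1, Wc2, np.append(x, u))  # critic output J(t)

    # Critic update: gradient descent on E_c = e_c^2 / 2,
    # with e_c = alpha*J(t) - (J(t-1) - r(t)); output-layer weights only.
    e_c = alpha * J.item() - (J_prev - r)
    Wc2 -= lr_c * e_c * alpha * hc[None, :]

    # Action update: drive J toward the ultimate objective U_c = 0,
    # backpropagating through the critic's dependence on u.
    e_a = J.item() - 0.0
    dJ_du = ((Wc2 * (1 - hc ** 2)) @ Wc1[:, -1]).item()
    Wa2 -= lr_a * e_a * dJ_du * ha[None, :]

    J_prev = J.item()
    x = A @ x + B * u.item()                        # advance the toy plant
```

The UUB result in the paper concerns exactly these weight-estimation errors: under the stated conditions, the trajectories of the critic and actor weight errors produced by updates of this kind stay within a bounded region rather than necessarily converging to the ideal weights.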