IEEE Transactions on Neural Networks
Convergence of the value-iteration-based heuristic dynamic programming (HDP) algorithm is proven for general nonlinear systems. That is, it is shown that HDP converges to the optimal control and to the optimal value function that solves the Hamilton-Jacobi-Bellman equation arising in infinite-horizon discrete-time (DT) nonlinear optimal control. It is assumed that, at each iteration, the value and action update equations can be solved exactly. Two standard neural networks (NNs) are used: a critic NN approximates the value function, while an action NN approximates the optimal control policy. It is stressed that this approach allows HDP to be implemented without knowledge of the internal dynamics of the system. The exact-solution assumption holds for some classes of nonlinear systems and, in particular, for the DT linear quadratic regulator (LQR), where the action is linear and the value is quadratic in the states, so the NNs have zero approximation error. It is stressed that, for the LQR, HDP may be implemented with two NNs and without knowing the system A matrix. This fact is not generally appreciated in the folklore of HDP for the DT LQR, where only a single critic NN is typically used.
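For the DT LQR special case discussed above, the HDP value iteration can be sketched in a few lines. This is a minimal illustrative sketch, not the paper's implementation: the system matrices below are made up, the critic is represented exactly by the quadratic kernel P (so its "training" is an exact update rather than NN fitting), and the actor is the linear gain K obtained from the action update.

```python
import numpy as np

# Hypothetical 2-state system (illustrative values only, not from the paper)
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)          # state cost x' Q x
R = np.array([[1.0]])  # control cost u' R u

# HDP value iteration starts from the zero value function V_0 = 0,
# i.e. a zero critic kernel P.
P = np.zeros((2, 2))
for _ in range(500):
    # Action update: control that minimizes one-step cost plus current value,
    # u = K x with K = -(R + B'PB)^{-1} B'PA
    K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    # Value update: one-step cost under that action plus value of next state
    Acl = A + B @ K
    P = Q + K.T @ R @ K + Acl.T @ P @ Acl

# At convergence, P satisfies the discrete-time algebraic Riccati equation:
# P = A'PA - A'PB (R + B'PB)^{-1} B'PA + Q
residual = (A.T @ P @ A - P + Q
            - A.T @ P @ B @ np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A))
print("DARE residual norm:", np.linalg.norm(residual))
```

In an NN implementation of the same scheme, the exact P and K updates above are replaced by least-squares fits of the critic and action networks to sampled state data, which is what removes the need to know the A matrix.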