An online adaptive reinforcement-learning-based solution is developed for the infinite-horizon optimal control problem for continuous-time uncertain nonlinear systems. A novel actor-critic-identifier (ACI) architecture is proposed to approximate the Hamilton-Jacobi-Bellman equation using three neural network (NN) structures: the actor and critic NNs approximate the optimal control and the optimal value function, respectively, while a robust dynamic NN identifier asymptotically approximates the uncertain system dynamics. An advantage of the ACI architecture is that learning by the actor, critic, and identifier proceeds continuously and simultaneously, without requiring knowledge of the system drift dynamics. Convergence of the algorithm is analyzed using Lyapunov-based adaptive control methods. A persistence of excitation condition is required to guarantee exponential convergence to a bounded region in a neighborhood of the optimal control and uniformly ultimately bounded (UUB) stability of the closed-loop system. Simulation results demonstrate the performance of the actor-critic-identifier method for approximate optimal control.
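To illustrate the actor-critic idea behind this architecture, the following is a minimal sketch, not the paper's actual update laws or proofs. It assumes a known scalar system xdot = a*x + u (so no identifier NN is needed), quadratic cost r = x^2 + u^2, linear-in-parameters approximators V(x) = w_c*x^2 (critic) and u(x) = -w_a*x (actor), and randomly sampled states standing in for the persistence of excitation condition. All names and constants here are illustrative choices, not from the paper.

```python
import numpy as np

# Hedged sketch of actor-critic learning on the HJB residual; NOT the
# paper's ACI update laws. For a = -1 and cost x^2 + u^2, the optimal
# value is V(x) = w*x^2 with w solving w^2 + 2*w - 1 = 0, i.e.
# w = sqrt(2) - 1 ~ 0.414, and the optimal control is u = -w*x.

rng = np.random.default_rng(0)
a = -1.0                 # assumed-known drift coefficient (no identifier)
w_c, w_a = 0.0, 0.0      # critic and actor weights
lr_c, lr_a = 0.1, 0.1    # learning rates

for _ in range(5000):
    x = rng.uniform(-2.0, 2.0)        # exploratory state sample ("PE")
    u = -w_a * x                      # actor's control
    xdot = a * x + u
    # HJB (Bellman) residual: r(x, u) + (dV/dx) * xdot should be zero
    delta = x**2 + u**2 + (2.0 * w_c * x) * xdot
    # Normalized gradient descent on the squared residual (critic)
    phi = 2.0 * x * xdot              # d(delta)/d(w_c)
    w_c -= lr_c * delta * phi / (1.0 + phi**2)
    # Actor tracks the control implied by the critic: u* = -(dV/dx)/2
    w_a -= lr_a * (w_a - w_c)

print(round(w_c, 3), round(w_a, 3))   # both near sqrt(2) - 1
```

In the paper's setting the drift would be unknown and supplied by the identifier NN, and the weight updates would carry the robustifying terms needed for the Lyapunov-based UUB guarantee; this sketch only shows the shared structure of critic updates driven by the HJB residual and an actor tracking the critic's implied control.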