It is well known that the H∞ state feedback control problem can be viewed as a two-player zero-sum game and reduced to finding a solution of the algebraic Riccati equation (ARE). In this paper, we propose a simultaneous policy update algorithm (SPUA) for solving the ARE, and develop offline and online versions. The offline SPUA is a model-based approach that obtains the solution of the ARE by solving a sequence of Lyapunov equations (LEs). Its convergence is established rigorously by constructing a Newton's sequence for the fixed point equation. The online SPUA is a partially model-free approach that leverages the idea of reinforcement learning (RL) to learn the solution of the ARE online without requiring knowledge of the internal system dynamics; in it, both players update their action policies simultaneously. The convergence of the online SPUA is proved by showing that it is mathematically equivalent to the offline SPUA. Finally, comparative simulation studies on an F-16 aircraft plant and a power system show that both the offline and online SPUA can find the solution of the ARE, and converge much faster than existing methods.
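The offline iteration described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the standard game ARE A'P + PA + Q - P(B R⁻¹B' - γ⁻²DD')P = 0, applies Newton's method to it, and realizes each Newton step as a Lyapunov equation in which both players' policy terms are updated from the same current iterate. All matrix names and test values are illustrative assumptions.

```python
import numpy as np

def lyap(Ac, M):
    """Solve the Lyapunov equation Ac' P + P Ac + M = 0 for symmetric P
    via the Kronecker-product linearization (fine for small n)."""
    n = Ac.shape[0]
    I = np.eye(n)
    K = np.kron(I, Ac.T) + np.kron(Ac.T, I)
    P = np.linalg.solve(K, -M.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off

def offline_spua(A, B, D, Q, R, gamma, n_iter=50, tol=1e-12):
    """Sketch of the offline SPUA: a sequence of Lyapunov equations
    whose limit solves the game ARE."""
    n = A.shape[0]
    # Combined two-player quadratic term: minimizer's B R^{-1} B'
    # minus the disturbance player's gamma^{-2} D D'.
    S = B @ np.linalg.solve(R, B.T) - gamma**-2 * (D @ D.T)
    P = np.zeros((n, n))
    for _ in range(n_iter):
        Ac = A - S @ P                   # both policies updated simultaneously
        P_new = lyap(Ac, Q + P @ S @ P)  # Newton step as a Lyapunov equation
        if np.max(np.abs(P_new - P)) < tol:
            return P_new
        P = P_new
    return P
```

For a stable scalar system the iterates converge quadratically to the ARE root; checking the ARE residual of the returned `P` is a simple way to validate the sketch.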