Brief paper: Adaptive optimal control for continuous-time linear systems based on policy iteration

Authors:
D. Vrabie;O. Pastravanu;M. Abu-Khalaf;F. L. Lewis
Affiliations:
Automation and Robotics Research Institute, The University of Texas at Arlington, 7300 Jack Newell Blvd. S., Ft. Worth, TX 76118, USA;Technical University "Gh. Asachi"-Automatic Control Department, Blvd. D. Mangeron 53A, 700050 Iasi, Romania;The Mathworks Inc., 3 Apple Hill Drive, Natick, MA 01760, USA;Automation and Robotics Research Institute, The University of Texas at Arlington, 7300 Jack Newell Blvd. S., Ft. Worth, TX 76118, USA
Venue:
Automatica (Journal of IFAC)
Year:
2009

Abstract

In this paper we propose a new scheme based on adaptive critics for finding online the state-feedback, infinite-horizon optimal control solution of linear continuous-time systems using only partial knowledge of the system dynamics. In other words, the algorithm solves an algebraic Riccati equation online without knowing the internal dynamics model of the system. Being based on policy iteration, the algorithm alternates between policy evaluation and policy update steps until an update of the control policy no longer improves the system performance. The result is a direct adaptive control algorithm that converges to the optimal control solution without using an explicit, a priori obtained model of the system's internal dynamics. The effectiveness of the algorithm is demonstrated by finding the optimal load-frequency controller for a power system.
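
The alternation between policy evaluation and policy update described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the "partial knowledge" is the input matrix B, that a stabilizing initial gain is available, and that the value function is quadratic, V(x) = x^T P x. The function names (`simulate_interval`, `quad_basis`, `unpack`, `policy_iteration`) are hypothetical, and a simulated plant stands in for online measurements. Restarting each sample from a random state is an added assumption to keep the least-squares problem well conditioned. The matrix A is used only inside the simulator; the learning step sees only states, accumulated costs, and B.

```python
import numpy as np

def simulate_interval(A, B, K, Q, R, x0, T, steps=200):
    """Stand-in for the real plant: RK4-integrate the closed loop u = -Kx
    over [0, T] and return the final state and the accumulated cost."""
    Ac = A - B @ K
    W = Q + K.T @ R @ K          # running cost x^T (Q + K^T R K) x

    def f(z):
        x = z[:-1]
        return np.append(Ac @ x, x @ W @ x)

    z = np.append(x0, 0.0)       # augmented state [x, cost]
    h = T / steps
    for _ in range(steps):
        k1 = f(z)
        k2 = f(z + 0.5 * h * k1)
        k3 = f(z + 0.5 * h * k2)
        k4 = f(z + h * k3)
        z = z + h / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return z[:-1], z[-1]

def quad_basis(x):
    """Quadratic monomials x_i * x_j for i <= j."""
    n = len(x)
    return np.array([x[i] * x[j] for i in range(n) for j in range(i, n)])

def unpack(p, n):
    """Rebuild symmetric P from the coefficients of quad_basis."""
    P = np.zeros((n, n))
    idx = 0
    for i in range(n):
        for j in range(i, n):
            P[i, j] = P[j, i] = p[idx] if i == j else p[idx] / 2.0
            idx += 1
    return P

def policy_iteration(A, B, Q, R, K0, iters=10, samples=20, T=0.5, seed=0):
    """Alternate data-based policy evaluation and policy update."""
    rng = np.random.default_rng(seed)
    n = B.shape[0]
    K = K0
    for _ in range(iters):
        Phi, y = [], []
        for _ in range(samples):
            x0 = rng.uniform(-1.0, 1.0, n)
            xT, cost = simulate_interval(A, B, K, Q, R, x0, T)
            # Policy evaluation uses x0^T P x0 - xT^T P xT = cost on [0, T].
            Phi.append(quad_basis(x0) - quad_basis(xT))
            y.append(cost)
        p, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
        P = unpack(p, n)
        # Policy update needs B and R but not A.
        K = np.linalg.solve(R, B.T @ P)
    return P, K
```

Under these assumptions, calling `policy_iteration` with a stable illustrative plant and `K0 = 0` should return a matrix `P` that approximately satisfies the algebraic Riccati equation, which can be checked by evaluating the residual A^T P + P A - P B R^{-1} B^T P + Q.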