In this paper, an integral reinforcement learning (IRL) algorithm based on an actor-critic structure is developed to learn online the solution to the Hamilton-Jacobi-Bellman equation for partially-unknown constrained-input systems. The technique of experience replay is used to update the critic weights so as to solve an IRL Bellman equation. Thus, unlike existing reinforcement learning algorithms, recorded past experiences are used concurrently with current data to adapt the critic weights. It is shown that with this technique, instead of the traditional persistence of excitation condition, which is often difficult or impossible to verify online, an easy-to-check condition on the richness of the recorded data is sufficient to guarantee convergence to a near-optimal control law. Stability of the proposed feedback control law is proven, and the effectiveness of the method is illustrated with simulation examples.
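The core mechanism the abstract describes — updating the critic weights on the current IRL Bellman residual together with residuals recomputed on recorded samples, and replacing persistence of excitation with a rank condition on the recorded regressors — can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the basis `phi`, the class and method names, the learning rate, and the buffer policy are all assumptions made for the example.

```python
import numpy as np

def phi(x):
    # Assumed quadratic critic basis: phi(x) = [x1^2, x1*x2, x2^2].
    return np.array([x[0] ** 2, x[0] * x[1], x[1] ** 2])

class ReplayCritic:
    """Hypothetical critic with experience-replay weight updates."""

    def __init__(self, n_features, lr=0.1, buffer_size=30):
        self.W = np.zeros(n_features)   # critic weights
        self.lr = lr
        self.buffer = []                # recorded (regressor, integral cost) pairs
        self.buffer_size = buffer_size

    def bellman_error(self, regressor, target):
        # IRL Bellman residual: W^T (phi(x_t) - phi(x_{t+T})) - integral cost.
        return self.W @ regressor - target

    def update(self, x_t, x_next, integral_cost):
        reg = phi(x_t) - phi(x_next)
        # Record the sample so it can be replayed in later updates.
        if len(self.buffer) < self.buffer_size:
            self.buffer.append((reg, integral_cost))
        # Gradient of the squared residual on the CURRENT sample...
        grad = self.bellman_error(reg, integral_cost) * reg
        # ...plus the residual gradients on all RECORDED samples (experience replay).
        for r, c in self.buffer:
            grad += self.bellman_error(r, c) * r
        # Normalized, averaged gradient step.
        self.W -= self.lr * grad / (len(self.buffer) + 1) / (1.0 + reg @ reg)

    def data_is_rich(self):
        # Easy-to-check richness condition: the stacked recorded regressors
        # must span the feature space (full column rank), in place of the
        # persistence-of-excitation condition on the online signal.
        if not self.buffer:
            return False
        M = np.stack([r for r, _ in self.buffer])
        return np.linalg.matrix_rank(M) == self.W.size
```

With targets generated from a known weight vector, the recorded data quickly satisfies the rank condition and the critic weights converge, which is the convergence guarantee the abstract refers to.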