Brief paper: Model-free H∞ control design for unknown linear discrete-time systems via Q-learning with LMI

Authors:
J.-H. Kim; F. L. Lewis
Affiliations:
Department of Electronics Engineering, Chungbuk National University, Chungbuk 361-763, Republic of Korea; Automation and Robotics Research Institute, The University of Texas at Arlington, Fort Worth, TX 76118, USA
Venue:
Automatica (Journal of IFAC)
Year:
2010

Citing 2
Cited 5

Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation

Automatica (Journal of IFAC)
Brief paper: Model-free Q-learning designs for linear discrete-time zero-sum games with application to H-infinity control

Automatica (Journal of IFAC)

Voting in multi-agent system for improvement of partial observations

KES-AMSTA'11 Proceedings of the 5th KES international conference on Agent and multi-agent systems: technologies and applications
An iterative adaptive dynamic programming algorithm for optimal control of unknown discrete-time nonlinear systems with constrained inputs

Information Sciences: an International Journal
From model-based control to data-driven control: Survey, classification and perspective

Information Sciences: an International Journal
Neural-network-based zero-sum game for discrete-time nonlinear systems via iterative adaptive dynamic programming algorithm

Neurocomputing
Neuro-optimal control for a class of unknown nonlinear dynamic systems using SN-DHP technique

Neurocomputing

Abstract

This paper develops a model-free H∞ control design algorithm for unknown linear discrete-time systems using Q-learning, a reinforcement learning method based on an actor-critic structure. In model-free design, no dynamical model of the system is known: the system matrices are unavailable, but the state and input variables can be measured. The paper derives an iterative solution algorithm for H∞ control design based on policy iteration. The algorithm is expressed as linear matrix inequalities (LMIs) that do not involve the system matrices and require only data measured from the system state and input. It is shown that, for a sufficiently rich disturbance, the algorithm converges to the standard H∞ control solution obtained using the exact system model. Two numerical examples demonstrate that the H∞ control can be obtained without any knowledge of the system dynamics matrices, and that the results converge to those obtained with the exact system dynamics matrices.
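
The idea of learning an H∞ controller from measured data alone can be illustrated with a minimal sketch. It is not the paper's LMI-based algorithm: the policy-evaluation step below is solved by ordinary least squares instead, and it uses a batch of randomly sampled one-step transitions rather than a single trajectory. The scalar plant (`A`, `B`, `E`), the weights `Qc` and `R`, the attenuation level `gamma`, and all function names are hypothetical choices made for illustration. The plant matrices appear only inside `step`, which stands in for the unknown system; the learner sees only states, inputs, and disturbances.

```python
import numpy as np

def quad_features(z):
    """Monomials z_i * z_j (i <= j), so that z' H z == features @ theta."""
    i, j = np.triu_indices(z.size)
    return z[i] * z[j]

def theta_to_H(theta, d):
    """Rebuild the symmetric Q-function kernel H from the fitted parameters."""
    H = np.zeros((d, d))
    H[np.triu_indices(d)] = theta
    return (H + H.T) / 2  # off-diagonal parameters equal 2 * H_ij

def evaluate_policy(step, K, L, dims, Qc, R, gamma, rng, samples=200):
    """Fit H for the policies u = -K x, w = L x from measured transitions.

    Uses the Bellman equation z_k' H z_k = r_k + z_{k+1}' H z_{k+1},
    where z = [x; u; w] and z_{k+1} follows the current policies.
    """
    n, m, q = dims
    rows, targets = [], []
    for _ in range(samples):
        # Randomly chosen state, input, and disturbance (rich excitation).
        x = rng.standard_normal(n)
        u = rng.standard_normal(m)
        w = rng.standard_normal(q)
        x_next = step(x, u, w)  # measured from the system, model not used
        z = np.concatenate([x, u, w])
        z_next = np.concatenate([x_next, -K @ x_next, L @ x_next])
        cost = x @ Qc @ x + u @ R @ u - gamma**2 * (w @ w)
        rows.append(quad_features(z) - quad_features(z_next))
        targets.append(cost)
    theta, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return theta_to_H(theta, n + m + q)

def improve_policy(H, dims):
    """Saddle point of z' H z over (u, w): [u; w] = -G^{-1} F x."""
    n, m, _ = dims
    G = H[n:, n:]  # blocks coupling u and w
    F = H[n:, :n]  # blocks coupling (u, w) with x
    gains = np.linalg.solve(G, F)
    return gains[:m], -gains[m:]  # K (u = -K x), L (w = L x)

def q_learning_hinf(step, dims, Qc, R, gamma, iters=20, seed=0):
    """Policy iteration on the Q-function, starting from zero gains.

    Zero gains are admissible here only because the example plant is stable.
    """
    n, m, q = dims
    rng = np.random.default_rng(seed)
    K, L = np.zeros((m, n)), np.zeros((q, n))
    for _ in range(iters):
        H = evaluate_policy(step, K, L, dims, Qc, R, gamma, rng)
        K, L = improve_policy(H, dims)
    return K, L, H

# Hypothetical scalar plant x_{k+1} = A x_k + B u_k + E w_k (assumed values).
A, B, E = np.array([[0.8]]), np.array([[1.0]]), np.array([[0.5]])
step = lambda x, u, w: A @ x + B @ u + E @ w

K, L, H = q_learning_hinf(step, dims=(1, 1, 1), Qc=np.eye(1), R=np.eye(1), gamma=2.0)
print("control gain K:", K)
print("worst-case disturbance gain L:", L)
```

For this example the learned gains can be checked against a model-based solution, for instance by iterating the game Riccati recursion with the true `A`, `B`, `E`. That mirrors the comparison with the exact-model solution described in the abstract.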