Multi-player non-zero-sum games: Online adaptive learning solution of coupled Hamilton-Jacobi equations

Authors:
Kyriakos G. Vamvoudakis; Frank L. Lewis
Venue:
Automatica (Journal of IFAC)
Year:
2011

Citing 8

Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks
Neural Networks

Neural Network Control of Robot Manipulators and Nonlinear Systems

Brief paper: Adaptive optimal control for continuous-time linear systems based on policy iteration
Automatica (Journal of IFAC)

Adaptive optimal controllers based on Generalized Policy Iteration in a continuous-time framework
MED '09 Proceedings of the 2009 17th Mediterranean Conference on Control and Automation

Reinforcement learning and adaptive dynamic programming for feedback control
IEEE Circuits and Systems Magazine

Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem
Automatica (Journal of IFAC)

Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach
Automatica (Journal of IFAC)

Adaptive neural control of uncertain MIMO nonlinear systems
IEEE Transactions on Neural Networks

Cited 5

2012 Special Issue: An iterative ε-optimal control scheme for a class of discrete-time nonlinear systems with unfixed initial state
Neural Networks

Multi-agent differential graphical games: Online adaptive learning solution for synchronization with optimality
Automatica (Journal of IFAC)

Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics
Automatica (Journal of IFAC)

Pareto optimality in infinite horizon linear quadratic differential games
Automatica (Journal of IFAC)

Reinforcement learning algorithms with function approximation: Recent advances and applications
Information Sciences: an International Journal

Quantified Score

Hi-index	22.15

Abstract

In this paper we present an online adaptive control algorithm, based on policy iteration reinforcement learning techniques, that solves the continuous-time (CT) multi-player non-zero-sum (NZS) game with infinite horizon for linear and nonlinear systems. NZS games allow players to have both a cooperative team component and an individual selfish component of strategy. The adaptive algorithm learns online the solution of the coupled Riccati equations for linear systems and of the coupled Hamilton-Jacobi equations for nonlinear systems. This adaptive control method finds, in real time, approximations of the optimal value and the NZS Nash equilibrium, while also guaranteeing closed-loop stability. The optimal-adaptive algorithm is implemented as a separate actor/critic parametric network approximator structure for every player, and involves simultaneous continuous-time adaptation of the actor and critic networks. A persistence of excitation condition is shown to guarantee convergence of every critic to the actual optimal value function for that player. A detailed mathematical analysis is given for 2-player NZS games, and novel tuning algorithms are given for the actor/critic networks. Convergence to the Nash equilibrium is proven, and stability of the system is also guaranteed. This provides optimal adaptive control solutions for both non-zero-sum games and their special case, zero-sum games. Simulation examples show the effectiveness of the new algorithm.
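
To make the "coupled Riccati equations" of the linear case concrete, the following is a minimal sketch for a scalar two-player game. It is not the paper's online actor/critic algorithm: it is a simple offline Lyapunov-type policy iteration, and the system, the weights, and every function name are assumptions chosen for illustration. Assume dynamics dx/dt = a*x + b1*u1 + b2*u2, costs J_i = integral of (q_i*x^2 + r_i1*u1^2 + r_i2*u2^2) dt, and linear policies u_i = -k_i*x with value functions V_i(x) = p_i*x^2. The coupled Riccati equations then read 0 = 2*a_c*p_i + q_i + r_i1*k1^2 + r_i2*k2^2, with a_c = a - b1*k1 - b2*k2 and k_i = b_i*p_i/r_ii.

```python
# Illustrative sketch only: OFFLINE policy iteration for a scalar two-player
# LQ non-zero-sum game. This is not the online actor/critic method of the
# paper; all names and numbers here are assumptions.

def closed_loop(a, b, k):
    """Closed-loop pole a_c = a - b1*k1 - b2*k2 under u_i = -k_i * x."""
    return a - b[0] * k[0] - b[1] * k[1]


def evaluate(a, b, q, r, k):
    """Policy evaluation: coefficients p_i of V_i(x) = p_i * x**2 for gains k."""
    a_c = closed_loop(a, b, k)
    if a_c >= 0:
        raise ValueError("gains are not stabilizing")
    return [
        (q[i] + r[i][0] * k[0] ** 2 + r[i][1] * k[1] ** 2) / (-2.0 * a_c)
        for i in range(2)
    ]


def improve(b, r, p):
    """Policy improvement: k_i = b_i * p_i / r_ii for each player."""
    return [b[i] * p[i] / r[i][i] for i in range(2)]


def residuals(a, b, q, r, p):
    """Left-hand sides of the two coupled Riccati equations (zero at a solution)."""
    k = improve(b, r, p)
    a_c = closed_loop(a, b, k)
    return [
        2.0 * a_c * p[i] + q[i] + r[i][0] * k[0] ** 2 + r[i][1] * k[1] ** 2
        for i in range(2)
    ]


def solve_game(a, b, q, r, k0=(0.0, 0.0), tol=1e-12, max_iter=500):
    """Alternate evaluation and improvement until the gains stop changing."""
    k = list(k0)  # k0 must be stabilizing; (0, 0) works when a < 0
    for _ in range(max_iter):
        k_new = improve(b, r, evaluate(a, b, q, r, k))
        done = max(abs(k_new[i] - k[i]) for i in range(2)) < tol
        k = k_new
        if done:
            break
    return evaluate(a, b, q, r, k), k


if __name__ == "__main__":
    # Assumed example data: stable open-loop plant, cross-weights r12 = r21 = 0.5.
    a, b = -1.0, (1.0, 1.0)
    q, r = (1.0, 2.0), ((1.0, 0.5), (0.5, 1.0))
    p, k = solve_game(a, b, q, r)
    print("value coefficients:", p)
    print("feedback gains:", k)
    print("Riccati residuals:", residuals(a, b, q, r, p))
```

A quick sanity check of the Nash property is to perturb one player's gain while holding the other fixed and confirm, via evaluate, that the deviating player's own cost coefficient does not decrease. Convergence of this simple iteration is not guaranteed for non-zero-sum games in general; it is used here only because it behaves well for the assumed numbers.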