Multi-player non-zero-sum games: Online adaptive learning solution of coupled Hamilton-Jacobi equations

  • Authors: Kyriakos G. Vamvoudakis; Frank L. Lewis

  • Venue: Automatica (Journal of IFAC)
  • Year: 2011

Quantified Score

Hi-index 22.15

Abstract

In this paper we present an online adaptive control algorithm, based on policy iteration reinforcement learning techniques, that solves the continuous-time (CT) multi-player non-zero-sum (NZS) game with infinite horizon for linear and nonlinear systems. NZS games allow players to have a cooperative team component and an individual selfish component of strategy. The adaptive algorithm learns online the solution of the coupled Riccati equations for linear systems and of the coupled Hamilton-Jacobi equations for nonlinear systems. This adaptive control method finds approximations of the optimal value and the NZS Nash equilibrium in real time, while also guaranteeing closed-loop stability. The optimal-adaptive algorithm is implemented as a separate actor/critic parametric network approximator structure for every player, and involves simultaneous continuous-time adaptation of the actor/critic networks. A persistence of excitation condition is shown to guarantee convergence of every critic to the actual optimal value function for that player. A detailed mathematical analysis is given for 2-player NZS games. Novel tuning algorithms are given for the actor/critic networks. Convergence to the Nash equilibrium is proven, and stability of the system is also guaranteed. This provides optimal adaptive control solutions for both non-zero-sum games and their special case, zero-sum games. Simulation examples show the effectiveness of the new algorithm.
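
To make the linear case of the abstract concrete, here is a minimal sketch of what "coupled Riccati equations" means for a scalar two-player game. It is not the paper's online actor/critic algorithm: it is a plain offline policy iteration (evaluate both value functions, then improve both feedback gains), and it may not converge for every choice of parameters. The dynamics dx/dt = a*x + b1*u1 + b2*u2, the costs J_i = ∫ (q_i*x² + r_i1*u1² + r_i2*u2²) dt, the function names, and the numerical values are all assumptions chosen for illustration.

```python
def solve_scalar_nzs_game(a, b, q, r, k0=(0.0, 0.0), iters=200, tol=1e-12):
    """Offline policy iteration for a scalar 2-player linear-quadratic NZS game.

    Assumed (hypothetical) setup:
      dynamics  dx/dt = a*x + b[0]*u1 + b[1]*u2
      cost of player i:  integral of q[i]*x^2 + r[i][0]*u1^2 + r[i][1]*u2^2
      policies  u_i = -k[i]*x,  value functions V_i(x) = p[i]*x^2
    The initial gains k0 must make the closed loop stable.
    """
    k = list(k0)
    p = [0.0, 0.0]
    for _ in range(iters):
        a_cl = a - b[0] * k[0] - b[1] * k[1]  # closed-loop pole
        if a_cl >= 0.0:
            raise ValueError("current gains do not stabilize the system")
        # Policy evaluation: scalar Lyapunov equation for each player,
        #   2*a_cl*p_i + q_i + r_i1*k1^2 + r_i2*k2^2 = 0
        p_new = [
            (q[i] + r[i][0] * k[0] ** 2 + r[i][1] * k[1] ** 2) / (-2.0 * a_cl)
            for i in range(2)
        ]
        # Policy improvement: each player minimizes its own Hamiltonian.
        k = [b[i] * p_new[i] / r[i][i] for i in range(2)]
        change = max(abs(p_new[i] - p[i]) for i in range(2))
        p = p_new
        if change < tol:
            break
    return p, k


def riccati_residuals(a, b, q, r, p):
    """Residuals of the two coupled scalar Riccati equations at p."""
    k = [b[i] * p[i] / r[i][i] for i in range(2)]
    a_cl = a - b[0] * k[0] - b[1] * k[1]
    return [
        2.0 * a_cl * p[i] + q[i] + r[i][0] * k[0] ** 2 + r[i][1] * k[1] ** 2
        for i in range(2)
    ]


if __name__ == "__main__":
    # Illustrative numbers only: a stable plant and symmetric players.
    a, b, q = -1.0, (1.0, 1.0), (1.0, 1.0)
    r = ((1.0, 0.0), (0.0, 1.0))  # r[i][j]: weight of u_j in player i's cost
    p, k = solve_scalar_nzs_game(a, b, q, r)
    # For these numbers the coupled equations reduce to 3p^2 + 2p - 1 = 0,
    # whose positive root is p1 = p2 = 1/3.
    print(p, k, riccati_residuals(a, b, q, r, p))
```

The two equations are coupled because each player's Lyapunov equation uses the closed-loop pole produced by both gains. The method in the abstract instead tunes actor/critic approximators online along the system trajectory, which this offline sketch does not attempt to reproduce.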