Monte Carlo TD(λ)-methods for the optimal control of discrete-time Markovian jump linear systems

  • Authors:
  • Oswaldo L. V. Costa;Julio C. C. Aya

  • Affiliations:
  • Departamento de Engenharia de Telecomunicações e Controle, Escola Politécnica da Universidade de São Paulo, CEP: 05508 900 São Paulo SP Brazil;Departamento de Engenharia de Telecomunicações e Controle, Escola Politécnica da Universidade de São Paulo, CEP: 05508 900 São Paulo SP Brazil

  • Venue:
  • Automatica (Journal of IFAC)
  • Year:
  • 2002

Quantified Score

Hi-index 22.15

Abstract

In this paper, we present an iterative technique based on Monte Carlo simulations for deriving the optimal control of the infinite-horizon linear regulator problem of discrete-time Markovian jump linear systems for the case in which the transition probability matrix of the Markov chain is not known. We draw a parallel with the theory of TD(λ) algorithms for Markovian decision processes to develop a TD(λ)-like algorithm for the optimal control associated with the maximal solution of a set of coupled algebraic Riccati equations (CARE). It is assumed that either there is a sample of past observations of the Markov chain that can be used for the iterative algorithm, or that such a sample can be generated through a computer program. Our proofs rely on the spectral radius of the closed-loop operators associated with the mean square stability of the system being less than 1.
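
The spectral-radius condition mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes the transition matrix `P` is known (the paper treats the case where it is not) and uses the standard second-moment operator for a jump system x(k+1) = A_θ(k) x(k), whose matrix form is (Pᵀ ⊗ I) · blockdiag(A_i ⊗ A_i). The function name `mss_spectral_radius` and the numerical values are hypothetical, chosen only for illustration.

```python
import numpy as np

def mss_spectral_radius(A_list, P):
    """Spectral radius of the second-moment operator of x(k+1) = A_theta(k) x(k).

    A_list: list of N real (n x n) arrays, one per mode (for a closed-loop
            check, pass A_i + B_i K_i; that choice is an assumption here).
    P:      (N x N) transition probability matrix, P[i][j] = Prob(i -> j).
    A value below 1 indicates mean square stability.
    """
    P = np.asarray(P, dtype=float)
    N = len(A_list)
    n = A_list[0].shape[0]
    m = n * n
    # Block-diagonal matrix with blocks A_i (kron) A_i, from vec(A Q A') = (A kron A) vec(Q).
    D = np.zeros((N * m, N * m))
    for i, A in enumerate(A_list):
        D[i * m:(i + 1) * m, i * m:(i + 1) * m] = np.kron(A, A)
    # Block (j, i) of the operator is p_ij * (A_i kron A_i).
    op = np.kron(P.T, np.eye(m)) @ D
    return float(np.max(np.abs(np.linalg.eigvals(op))))

# Hypothetical two-mode scalar example: one mode is unstable on its own.
modes = [np.array([[1.2]]), np.array([[0.5]])]
P = [[0.5, 0.5], [0.5, 0.5]]
print(mss_spectral_radius(modes, P) < 1.0)
```

In this scalar example the radius is 0.5 · (1.2² + 0.5²) ≈ 0.845, so the system is mean square stable even though the first mode alone is not, which is why the condition is stated on the coupled operator rather than on each mode separately.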