Notes on average Markov decision processes with a minimum-variance criterion

  • Authors: Liu Jianyong
  • Affiliations: Institute of Applied Mathematics, Academia Sinica, Beijing 100080, People's Republic of China
  • Venue: Operations Research Letters
  • Year: 2002

Abstract

In this paper we consider the problem of optimizing the variance of the reward for nonstationary average Markov decision processes (MDPs). Examples in this paper show that there are mistakes in the proofs of the main theorems of two papers, Kurano [J. Math. Anal. Appl. 123 (1987) 572] and Guo [Math. Meth. Oper. Res. 49 (1999) 87-96], which investigated the optimization of the variance of the sum of costs for average MDPs. We propose a variance criterion different from the one investigated by Kurano (1987) and Guo (1999), and we prove that, under some appropriate conditions, there exists a Markov policy that is ε-strong variance optimal for any ε > 0.