Continuous-Time Markov Decision Processes with State-Dependent Discount Factors

  • Authors: Liuer Ye; Xianping Guo
  • Affiliations: Department of Statistics, College of Economics, Jinan University, Guangzhou, P.R. China 510632; School of Mathematics and Computational Science, Sun Yat-sen University, Guangzhou, P.R. China 510275
  • Venue: Acta Applicandae Mathematicae: an international survey journal on applying mathematics and mathematical applications
  • Year: 2012

Abstract

We consider continuous-time Markov decision processes in Polish spaces. The performance of a control policy is measured by the expected discounted reward criterion with state-dependent discount factors. All underlying Markov processes are determined by the given transition rates, which are allowed to be unbounded, and the reward rates may have neither upper nor lower bounds. Using the dynamic programming approach, we establish the discounted reward optimality equation (DROE) and the existence and uniqueness of its solution. Under suitable conditions, we also obtain a discounted optimal stationary policy which is optimal in the class of all randomized stationary policies. Moreover, when the transition rates are uniformly bounded, we provide an algorithm to compute (or at least to approximate) the discounted reward optimal value function as well as a discounted optimal stationary policy. Finally, we illustrate our results with an example. In particular, for this example we derive an explicit and exact solution to the DROE and an explicit expression for a discounted optimal stationary policy.
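
To make the bounded-rate case concrete, here is a minimal sketch of how a DROE of this kind might be solved numerically. It is not the authors' algorithm: it assumes a finite state and action space (the paper works in Polish spaces), assumes the DROE takes the form alpha(x) V(x) = max_a { r(x,a) + sum_y q(y|x,a) V(y) } with every alpha(x) > 0, and uses the standard uniformization trick of adding Lambda V(x) to both sides, where Lambda bounds the exit rates. The function name solve_droe and the two-state model are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def solve_droe(
    alpha: Sequence[float],
    r: Sequence[Sequence[float]],
    q: Sequence[Sequence[Sequence[float]]],
    tol: float = 1e-10,
    max_iter: int = 100_000,
) -> Tuple[List[float], List[int]]:
    """Value iteration on a uniformized DROE (illustrative finite model).

    alpha[x]   : state-dependent discount rate, assumed > 0
    r[x][a]    : reward rate in state x under action a
    q[x][a][y] : transition rate from x to y under a (each row sums to 0)
    """
    n = len(alpha)
    # Uniformization constant: an upper bound on all exit rates -q(x|x,a).
    lam = max(-q[x][a][x] for x in range(n) for a in range(len(r[x])))
    v = [0.0] * n
    policy = [0] * n
    for _ in range(max_iter):
        new_v = []
        for x in range(n):
            candidates = []
            for a in range(len(r[x])):
                # sum_y (q(y|x,a) + lam * 1{y == x}) * v(y)
                drift = sum(q[x][a][y] * v[y] for y in range(n)) + lam * v[x]
                candidates.append((r[x][a] + drift) / (alpha[x] + lam))
            best_a = max(range(len(candidates)), key=candidates.__getitem__)
            policy[x] = best_a
            new_v.append(candidates[best_a])
        gap = max(abs(new_v[x] - v[x]) for x in range(n))
        v = new_v
        if gap < tol:  # successive iterates agree to within tol
            break
    return v, policy


# Hypothetical two-state, two-action model (values chosen for illustration).
alpha = [0.1, 0.5]
r = [[1.0, 2.0], [0.0, 0.5]]
q = [
    [[-1.0, 1.0], [-3.0, 3.0]],
    [[2.0, -2.0], [1.0, -1.0]],
]
value, policy = solve_droe(alpha, r, q)
```

Under these assumptions the update is a contraction with modulus Lambda / (min_x alpha(x) + Lambda) < 1, which is why the iteration converges from any starting point; the returned policy picks, in each state, an action attaining the maximum in the final sweep.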