Continuous-Time Markov Decision Processes with State-Dependent Discount Factors

Authors:
Liuer Ye;Xianping Guo
Affiliations:
Department of Statistics, College of Economics, Jinan University, Guangzhou, P.R. China 510632;School of Mathematics and Computational Science, Sun-Yat Sen University, Guangzhou, P.R. China 510275
Venue:
Acta Applicandae Mathematicae: an international survey journal on applying mathematics and mathematical applications
Year:
2012

Abstract

We consider continuous-time Markov decision processes in Polish spaces. The performance of a control policy is measured by the expected discounted reward criterion with state-dependent discount factors. All underlying Markov processes are determined by the given transition rates, which may be unbounded, and the reward rates may be unbounded from both above and below. Using the dynamic programming approach, we establish the discounted reward optimality equation (DROE) and the existence and uniqueness of its solutions. Under suitable conditions, we also obtain a discounted optimal stationary policy, which is optimal in the class of all randomized stationary policies. Moreover, when the transition rates are uniformly bounded, we provide an algorithm to compute (or at least to approximate) the discounted reward optimal value function as well as a discounted optimal stationary policy. Finally, we use an example to illustrate our results. In particular, we first derive an explicit and exact solution to the DROE and an explicit expression for a discounted optimal stationary policy for this example.
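
The abstract does not spell out the algorithm for uniformly bounded transition rates, so the following is only a minimal sketch of one standard technique for that setting (uniformization followed by value iteration), not the authors' method. It assumes a finite state and action space rather than Polish spaces, strictly positive discount factors alpha(x), and a DROE of the form alpha(x) u(x) = max_a { r(x,a) + sum_y q(y|x,a) u(y) }. The names `solve_droe`, `droe_residual`, and the two-state model are hypothetical and chosen for illustration.

```python
from typing import List, Tuple


def solve_droe(
    q: List[List[List[float]]],
    r: List[List[float]],
    alpha: List[float],
    tol: float = 1e-10,
    max_iter: int = 100_000,
) -> Tuple[List[float], List[int]]:
    """Approximate the value function and a greedy stationary policy.

    q[x][a][y] : transition rate from state x to y under action a
                 (rows sum to 0, off-diagonal entries are >= 0)
    r[x][a]    : reward rate in state x under action a
    alpha[x]   : discount factor in state x (assumed > 0)
    """
    n = len(alpha)
    # Uniformization constant: an upper bound on all exit rates -q[x][a][x].
    lam = max(-q[x][a][x] for x in range(n) for a in range(len(r[x])))

    u = [0.0] * n
    policy = [0] * n
    for _ in range(max_iter):
        new_u = []
        for x in range(n):
            best = float("-inf")
            for a in range(len(r[x])):
                # r + sum_y (q + lam * I)[x][y] * u[y], rescaled by alpha(x) + lam.
                total = r[x][a] + lam * u[x]
                total += sum(q[x][a][y] * u[y] for y in range(n))
                value = total / (alpha[x] + lam)
                if value > best:
                    best, policy[x] = value, a
            new_u.append(best)
        gap = max(abs(new_u[x] - u[x]) for x in range(n))
        u = new_u
        if gap < tol:
            break
    return u, policy


def droe_residual(
    q: List[List[List[float]]],
    r: List[List[float]],
    alpha: List[float],
    u: List[float],
) -> float:
    """Largest violation of alpha(x) u(x) = max_a {r + sum_y q u} over states."""
    n = len(alpha)
    worst = 0.0
    for x in range(n):
        rhs = max(
            r[x][a] + sum(q[x][a][y] * u[y] for y in range(n))
            for a in range(len(r[x]))
        )
        worst = max(worst, abs(alpha[x] * u[x] - rhs))
    return worst


# Hypothetical two-state, two-action model (illustrative numbers only).
Q = [[[-1.0, 1.0], [-2.0, 2.0]], [[3.0, -3.0], [0.5, -0.5]]]
R = [[1.0, 0.5], [2.0, 3.0]]
ALPHA = [0.1, 0.3]

u, policy = solve_droe(Q, R, ALPHA)
print(u, policy, droe_residual(Q, R, ALPHA, u))
```

Adding lam * u(x) to both sides of the assumed DROE and dividing by alpha(x) + lam gives a fixed-point map that is a contraction with modulus lam / (min_x alpha(x) + lam), which is why the iteration converges when the rates are bounded and the discount factors are bounded away from zero. The residual function is a simple check that the returned vector nearly satisfies the equation.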