We consider continuous-time Markov decision processes in Polish spaces. The performance of a control policy is measured by the expected discounted reward criterion with state-dependent discount factors. All underlying Markov processes are determined by the given transition rates, which are allowed to be unbounded, and the reward rates may be unbounded from both above and below. Using the dynamic programming approach, we establish the discounted reward optimality equation (DROE) and prove the existence and uniqueness of its solution. Under suitable conditions, we also obtain a discounted optimal stationary policy that is optimal in the class of all randomized stationary policies. Moreover, when the transition rates are uniformly bounded, we provide an algorithm to compute (or at least to approximate) the discounted reward optimal value function as well as a discounted optimal stationary policy. Finally, we illustrate our results with an example. In particular, for this example we derive an explicit and exact solution to the DROE and an explicit expression for a discounted optimal stationary policy.
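
To make the computational claim concrete, the following is a minimal sketch of value iteration for the bounded-rate case, restricted to a finite state and action space. It is not the algorithm of the paper, whose details the abstract does not give. It assumes that the DROE takes the form alpha(x) V(x) = max_a { r(x,a) + sum_y q(y|x,a) V(y) } with a discount rate alpha(x) > 0 depending on the state only, and it uses the standard uniformization device. The function name solve_droe, the data layout, and the two-state numbers are all hypothetical.

```python
def solve_droe(q, r, alpha, tol=1e-10, max_iter=100_000):
    """Approximate V with alpha[x]*V[x] = max_a (r[x][a] + sum_y q[x][a][y]*V[y]).

    q[x][a][y]: transition rates (each q[x][a] sums to 0, off-diagonals >= 0).
    r[x][a]:    reward rates.
    alpha[x]:   state-dependent discount rates, assumed strictly positive.
    """
    n = len(q)

    def drift(x, a, v):
        # r(x,a) + sum_y q(y|x,a) v(y), the bracket in the assumed DROE
        return r[x][a] + sum(q[x][a][y] * v[y] for y in range(n))

    # Uniform bound on the exit rates -q(x|x,a); finite because rates are bounded.
    lam = max(-q[x][a][x] for x in range(n) for a in range(len(q[x])))

    v = [0.0] * n
    for _ in range(max_iter):
        # Adding lam*V(x) to both sides of the DROE gives a fixed-point map
        # that is a contraction with modulus lam / (min(alpha) + lam) < 1.
        new_v = [
            max((drift(x, a, v) + lam * v[x]) / (alpha[x] + lam)
                for a in range(len(q[x])))
            for x in range(n)
        ]
        diff = max(abs(s - t) for s, t in zip(new_v, v))
        v = new_v
        if diff < tol:
            break

    # Greedy stationary policy with respect to the computed value function.
    policy = [
        max(range(len(q[x])), key=lambda a, x=x: drift(x, a, v))
        for x in range(n)
    ]
    return v, policy


# Hypothetical two-state, two-action data, for illustration only.
q = [
    [[-1.0, 1.0], [-3.0, 3.0]],
    [[2.0, -2.0], [0.5, -0.5]],
]
r = [[1.0, 0.0], [3.0, 4.0]]
alpha = [0.1, 0.3]

v, policy = solve_droe(q, r, alpha)
```

The returned v can be checked by substituting it back into the assumed equation: for each state, alpha[x] * v[x] should agree with the maximized bracket up to the tolerance. The sketch does not cover unbounded transition rates or general Polish state spaces, which are the setting of the paper.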