In this paper, we study the nth-bias optimality problem for finite continuous-time Markov decision processes (MDPs) with a multichain structure. We first provide nth-bias difference formulas for two policies and present some interesting characterizations of an nth-bias optimal policy by using these difference formulas. Then, we prove the existence of an nth-bias optimal policy by using nth-bias optimal policy iteration algorithms, and show that such an nth-bias optimal policy can be obtained in a finite number of policy iterations.
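For orientation, the two lowest-order quantities in this kind of hierarchy, the long-run average reward and the ordinary bias of a fixed policy, can be computed directly for a small example. The sketch below is only an illustration under simplifying assumptions that the abstract does not make: the chain is ergodic (a single recurrent class, not multichain), the policy is fixed, and the bias is taken to be the solution of (P* - Q) h = r - g·1, which follows from the standard deviation-matrix identity for finite continuous-time chains. The generator `Q`, the reward vector `r`, and all function names are hypothetical, and sign and indexing conventions for higher-order biases vary between authors.

```python
from fractions import Fraction


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination using exact Fractions."""
    n = len(A)
    # Augmented matrix [A | b].
    M = [[Fraction(v) for v in row] + [Fraction(b[i])] for i, row in enumerate(A)]
    for col in range(n):
        # Find a row with a nonzero pivot and swap it into place.
        pivot = next(i for i in range(col, n) if M[i][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        # Eliminate this column from every other row.
        for i in range(n):
            if i != col and M[i][col] != 0:
                factor = M[i][col]
                M[i] = [a - factor * c for a, c in zip(M[i], M[col])]
    return [M[i][n] for i in range(n)]


def stationary(Q):
    """Stationary distribution pi of an ergodic generator: pi Q = 0, sum(pi) = 1."""
    n = len(Q)
    # Rows of A are the columns of Q; the last equation is replaced
    # by the normalization constraint sum(pi) = 1.
    A = [[Q[j][i] for j in range(n)] for i in range(n)]
    A[-1] = [1] * n
    rhs = [0] * (n - 1) + [1]
    return solve(A, rhs)


def gain_and_bias(Q, r):
    """Average reward g and bias h of an ergodic chain with generator Q and rewards r."""
    n = len(Q)
    pi = stationary(Q)
    g = sum(p * x for p, x in zip(pi, r))
    # In the ergodic case every row of P* equals pi, so (P* - Q)[i][j] = pi[j] - Q[i][j].
    A = [[pi[j] - Q[i][j] for j in range(n)] for i in range(n)]
    h = solve(A, [x - g for x in r])
    return g, h


# Hypothetical two-state example: leave state 0 at rate 1, leave state 1 at rate 2.
Q = [[-1, 1], [2, -2]]
r = [3, 0]
gain, bias = gain_and_bias(Q, r)
# gain is 2 and bias is [1/3, -2/3]
```

The returned bias satisfies the Poisson equation r - g·1 + Q h = 0 together with the normalization P* h = 0. The multichain setting of the paper needs a state-dependent average reward and a more careful construction of P*, which this sketch does not attempt.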