Linear Programming and Constrained Average Optimality for General Continuous-Time Markov Decision Processes in History-Dependent Policies

  • Authors:
  • Xianping Guo; Yonghui Huang; Xinyuan Song

  • Affiliations:
  • mcsgxp@mail.sysu.edu.cn and hyongh5@mail.sysu.edu.cn;-;xysong@sta.cuhk.edu.hk

  • Venue:
  • SIAM Journal on Control and Optimization
  • Year:
  • 2012


Abstract

This paper studies constrained average optimality for continuous-time Markov decision processes in the class of randomized history-dependent policies. The states and actions lie in general Polish spaces, and the transition rates are allowed to be unbounded. The criterion to be optimized is an expected average cost, multiple constraints are imposed on similar expected average costs, and all of the costs may be unbounded from above and from below. Under suitable conditions, we first show the existence of a constrained optimal policy by refining the concept of a stable policy from the previous literature and using an analogue of the forward Kolmogorov equation. We then develop a linear program (LP) that is equivalent to the constrained optimality problem and can be used to obtain a constrained optimal policy. By introducing suitable operators and conditions, we further establish the dual program (DP) of the LP, show that both the LP and the DP are solvable, and prove that there is no duality gap between them. Finally, we use a cash-flow model and a controlled birth-and-death system to illustrate the applications of our main results.
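To give a concrete feel for the LP the abstract refers to, here is a minimal numerical sketch for the special case of a *finite* CTMDP, where the occupation-measure formulation reduces to an ordinary linear program solvable with off-the-shelf tools. This is not the paper's setting (which allows Polish spaces, unbounded rates, and history-dependent policies); all states, rates, costs, and the constraint bound below are illustrative assumptions.

```python
# Illustrative occupation-measure LP for a constrained average-cost CTMDP
# with 2 states and 2 actions (numbers are made up for this sketch).
import numpy as np
from scipy.optimize import linprog

# State-action pairs (s, a).
pairs = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Conservative transition rates q(s' | s, a): each row sums to zero.
q = {
    (0, 0): [-1.0, 1.0],   # in state 0, action 0: jump to state 1 at rate 1
    (0, 1): [-3.0, 3.0],
    (1, 0): [2.0, -2.0],
    (1, 1): [1.0, -1.0],
}

c = np.array([2.0, 1.0, 2.0, 3.0])  # running cost c(s, a), to be minimized
d = np.array([0.0, 4.0, 0.0, 0.0])  # constraint cost; require d @ x <= 0.5

# Decision variable x(s, a) is an occupation measure on state-action pairs.
# Equality constraints: flow balance sum_{s,a} x(s,a) q(s'|s,a) = 0 for each
# state s' (the stationary analogue of the forward Kolmogorov equation),
# plus the normalization sum x = 1.
A_eq = np.vstack([
    [q[p][0] for p in pairs],
    [q[p][1] for p in pairs],
    np.ones(4),
])
b_eq = np.array([0.0, 0.0, 1.0])

res = linprog(c, A_ub=[d], b_ub=[0.5], A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * 4)
x = res.x  # optimal occupation measure

# A randomized stationary policy can be read off by conditioning:
# pi(a | s) proportional to x(s, a).
pi0 = x[:2] / x[:2].sum()
print("occupation measure:", x.round(4))
print("policy in state 0:", pi0.round(4))
print("optimal average cost:", round(res.fun, 4))
```

In this toy instance the unconstrained optimum would put all the state-0 mass on the cheap action 1, but the constraint `d @ x <= 0.5` caps its occupation at 0.125, so the solver returns a genuinely randomized policy in state 0; this mirrors the general fact that constrained optimal policies typically require randomization.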