Discounted Continuous-Time Markov Decision Processes with Constraints: Unbounded Transition and Loss Rates

  • Authors:
  • Xianping Guo; Alexei Piunovskiy

  • Affiliations:
  • School of Mathematics and Computational Science, Zhongshan University, 510275 Guangzhou, People's Republic of China; Department of Mathematical Sciences, The University of Liverpool, Liverpool L69 7ZL, United Kingdom

  • Venue:
  • Mathematics of Operations Research
  • Year:
  • 2011

Abstract

This paper deals with denumerable continuous-time Markov decision processes (MDPs) with constraints. The optimality criterion to be minimized is the expected discounted loss, subject to several constraints of the same type. The transition rates may be unbounded, the loss rates are allowed to be unbounded as well (both from above and from below), and the policies may be history-dependent and randomized. Based on Kolmogorov's forward equation and Dynkin's formula, we recall the Bellman equation, introduce and study occupation measures, reformulate the optimization problem as a (primary) linear program, provide the form of optimal policies for the constrained optimization problem, and establish the duality between the convex analytic approach and dynamic programming. Finally, a series of examples is given to illustrate all of the results.
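The linear-programming reformulation via occupation measures mentioned in the abstract can be illustrated on a much simpler object than the paper treats: a finite, *discrete-time* discounted MDP with one extra cost constraint. The sketch below is only a toy analogue (the paper's setting is continuous-time with denumerable states and unbounded rates); all numbers (`P`, `c`, `d`, `budget`, the discount factor) are invented for illustration.

```python
# Toy analogue of the occupation-measure LP for a constrained discounted MDP.
# Decision variable: occupation measure mu[s, a] >= 0 satisfying, for each
# state j, the flow (balance) constraint
#     sum_a mu[j,a] - gamma * sum_{s,a} P[a,s,j] * mu[s,a] = alpha[j],
# where alpha is the initial distribution.  Expected discounted losses become
# linear functionals of mu, so the constrained problem is a linear program.
import numpy as np
from scipy.optimize import linprog

S, A = 2, 2                     # number of states and actions (made up)
gamma = 0.9                     # discount factor
alpha = np.array([0.5, 0.5])    # initial distribution

# P[a, s, s2] = probability of moving s -> s2 under action a (made up)
P = np.array([[[0.8, 0.2], [0.3, 0.7]],
              [[0.5, 0.5], [0.9, 0.1]]])
c = np.array([[1.0, 2.0], [4.0, 0.5]])   # c[s, a]: loss to be minimized
d = np.array([[0.0, 1.0], [1.0, 0.0]])   # d[s, a]: constrained cost
budget = 2.0                             # bound on expected discounted d-cost

# Build the flow constraints A_eq @ vec(mu) = alpha.
A_eq = np.zeros((S, S * A))
for j in range(S):
    for s in range(S):
        for a in range(A):
            A_eq[j, s * A + a] = (1.0 if s == j else 0.0) - gamma * P[a, s, j]

res = linprog(c=c.flatten(),
              A_ub=[d.flatten()], b_ub=[budget],  # <d, mu> <= budget
              A_eq=A_eq, b_eq=alpha,
              bounds=[(0, None)] * (S * A))
mu = res.x.reshape(S, A)

# A stationary randomized policy attaining the optimum is recovered by
# normalizing the occupation measure: pi(a | s) = mu[s, a] / sum_a mu[s, a].
pi = mu / mu.sum(axis=1, keepdims=True)
```

Summing the flow constraints over `j` shows the total mass of `mu` must equal `1 / (1 - gamma)`, a quick sanity check on the solution; the randomization in `pi` is exactly what constrained problems generically require, in contrast to the deterministic optimal policies of the unconstrained case.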