A multichain Markov decision process with constraints on the expected state-action frequencies may lead to a unique optimal policy which does not satisfy Bellman's principle of optimality. The model with sample-path constraints does not suffer from this drawback.