NP-Hardness of checking the unichain condition in average cost MDPs

Authors:
John N. Tsitsiklis
Affiliations:
Massachusetts Institute of Technology, Room 32-D662, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
Venue:
Operations Research Letters
Year:
2007

Abstract

The unichain condition requires that every policy in an MDP result in a single ergodic class, and guarantees that the optimal average cost is independent of the initial state. We show that checking whether the unichain condition fails to hold is an NP-complete problem. We conclude with a brief discussion of the merits of the more general weak accessibility condition.
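To make the definition concrete, the following is a minimal brute-force sketch, not the paper's construction. It assumes a finite MDP and restricts attention to stationary deterministic policies. The representation `P[s][a][t]` (probability of moving from state `s` to state `t` under action `a`) and all function names are hypothetical, chosen only for illustration. For each policy it builds the induced transition graph and counts the recurrent classes, which are the closed communicating classes.

```python
from itertools import product


def reachable(succ, start):
    """Return the set of states reachable from `start`, including `start`."""
    seen = {start}
    stack = [start]
    while stack:
        s = stack.pop()
        for t in succ[s]:
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def recurrent_classes(succ):
    """Return the recurrent classes of a finite chain as a set of frozensets.

    `succ[s]` lists the states reachable from s in one step with
    positive probability.
    """
    reach = [reachable(succ, s) for s in range(len(succ))]
    classes = set()
    for s in range(len(succ)):
        # s is recurrent iff every state reachable from s can reach s back;
        # in that case reach[s] is exactly the class containing s.
        if all(s in reach[t] for t in reach[s]):
            classes.add(frozenset(reach[s]))
    return classes


def violating_policy(P):
    """Return a deterministic policy with more than one recurrent class,
    or None if no such policy exists.

    Assumed layout (hypothetical): P[s][a][t] is the probability of
    moving from state s to state t under action a.
    """
    n = len(P)
    # Enumerate every deterministic policy: one action index per state.
    for policy in product(*(range(len(P[s])) for s in range(n))):
        succ = [
            [t for t, p in enumerate(P[s][policy[s]]) if p > 0]
            for s in range(n)
        ]
        if len(recurrent_classes(succ)) > 1:
            return policy
    return None


def is_unichain(P):
    """True iff every deterministic policy induces a single recurrent class."""
    return violating_policy(P) is None


# Two states; action 0 stays put, action 1 moves to the other state.
two_absorbing = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]

# State 0 has one action; both actions of state 1 lead back to state 0
# with positive probability.
mixing = [
    [[0.5, 0.5]],
    [[1.0, 0.0], [0.5, 0.5]],
]
```

In `two_absorbing`, the policy that stays put in both states yields two recurrent classes, so the unichain condition fails; in `mixing`, every policy yields a single class. The enumeration visits a number of policies that is exponential in the number of states, whereas verifying a single violating policy takes polynomial time, which is why such a policy serves as a certificate that the condition fails.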