We consider how state similarity in average reward Markov decision processes (MDPs) can be described by pseudometrics. Introducing the notion of adequate pseudometrics, which are well adapted to the structure of the MDP, we show how these can be used for state aggregation. We give upper bounds on the loss that may be incurred by working with the aggregated MDP instead of the original one, and compare them to the bounds that have been achieved for discounted reward MDPs.
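For illustration, pseudometric-based state aggregation can be sketched with a simple one-step pseudometric in the spirit of bisimulation metrics. The metric definition, the threshold `eps`, and the greedy grouping below are illustrative assumptions for this sketch, not the adequate pseudometrics or the aggregation scheme analyzed in the paper:

```python
import numpy as np

def pseudometric(R, P):
    # d(s, t) = max over actions of (reward gap + total-variation gap
    # between next-state distributions) -- an illustrative one-step choice.
    S, A = R.shape
    d = np.zeros((S, S))
    for s in range(S):
        for t in range(S):
            d[s, t] = max(
                abs(R[s, a] - R[t, a]) + 0.5 * np.abs(P[s, a] - P[t, a]).sum()
                for a in range(A)
            )
    return d

def aggregate(d, eps):
    # Greedy aggregation: each still-unlabeled state opens a new group
    # and absorbs all unlabeled states within distance eps of it.
    S = d.shape[0]
    labels, k = [-1] * S, 0
    for s in range(S):
        if labels[s] < 0:
            for t in range(S):
                if labels[t] < 0 and d[s, t] <= eps:
                    labels[t] = k
            k += 1
    return labels, k

# A 4-state, 1-action MDP in which states 0/1 and 2/3 behave identically.
R = np.array([[1.0], [1.0], [0.0], [0.0]])   # rewards R[state, action]
P = np.zeros((4, 1, 4))                      # transitions P[state, action, next]
P[0, 0, 2] = P[1, 0, 2] = 1.0                # states 0, 1 move to state 2
P[2, 0, 0] = P[3, 0, 0] = 1.0                # states 2, 3 move to state 0

labels, k = aggregate(pseudometric(R, P), eps=0.1)
print(labels, k)  # states 0/1 and 2/3 merge into two aggregate states
```

Working on the two-state aggregate MDP instead of the four-state original incurs no loss here, since merged states have identical rewards and transition behavior; the bounds discussed above quantify the loss when merged states are only approximately similar.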