A faster strongly polynomial minimum cost flow algorithm
STOC '88 Proceedings of the twentieth annual ACM symposium on Theory of computing
Bisimulation through probabilistic testing
Information and Computation
The formal semantics of programming languages: an introduction
Markov Decision Processes: Discrete Stochastic Dynamic Programming
A Calculus of Communicating Systems
Performance measure sensitive congruences for Markovian process algebras
Theoretical Computer Science
Diffusion Kernels on Graphs and Other Discrete Input Spaces
ICML '02 Proceedings of the Nineteenth International Conference on Machine Learning
The Metric Analogue of Weak Bisimulation for Probabilistic Processes
LICS '02 Proceedings of the 17th Annual IEEE Symposium on Logic in Computer Science
Concurrency and Automata on Infinite Sequences
Proceedings of the 5th GI-Conference on Theoretical Computer Science
Equivalence notions and model minimization in Markov decision processes
Artificial Intelligence - special issue on planning with uncertainty and incomplete information
UAI'97 Proceedings of the Thirteenth conference on Uncertainty in artificial intelligence
Transfer of samples in batch reinforcement learning
Proceedings of the 25th international conference on Machine learning
Pseudometrics for State Aggregation in Average Reward Markov Decision Processes
ALT '07 Proceedings of the 18th international conference on Algorithmic Learning Theory
Improving Batch Reinforcement Learning Performance through Transfer of Samples
Proceedings of the 2008 conference on STAIRS 2008: Proceedings of the Fourth Starting AI Researchers' Symposium
Solving factored MDPs with hybrid state and action variables
Journal of Artificial Intelligence Research
The Kantorovich Metric in Computer Science: A Brief Survey
Electronic Notes in Theoretical Computer Science (ENTCS)
Equivalence relations in fully and partially observable Markov decision processes
IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
Using bisimulation for policy transfer in MDPs
Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Smarter sampling in model-based Bayesian reinforcement learning
ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part I
Learning from demonstration using MDP induced metrics
ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part II
Basis function discovery using spectral clustering and bisimulation metrics
ALA'11 Proceedings of the 11th international conference on Adaptive and Learning Agents
Bisimulation Metrics for Continuous Markov Decision Processes
SIAM Journal on Computing
Automatic construction of temporally extended actions for MDPs using bisimulation metrics
EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
A pseudometric in supervisory control of probabilistic discrete event systems
Discrete Event Dynamic Systems
On-the-Fly exact computation of bisimilarity distances
TACAS'13 Proceedings of the 19th international conference on Tools and Algorithms for the Construction and Analysis of Systems
The BisimDist library: efficient computation of bisimilarity distances for Markovian models
QEST'13 Proceedings of the 10th international conference on Quantitative Evaluation of Systems
We present metrics for measuring the similarity of states in a finite Markov decision process (MDP). The formulation of our metrics is based on the notion of bisimulation for MDPs, with an aim towards solving discounted infinite horizon reinforcement learning tasks. Such metrics can be used to aggregate states, as well as to better structure other value function approximators (e.g., memory-based or nearest-neighbor approximators). We provide bounds that relate our metric distances to the optimal values of states in the given MDP.
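The fixed point underlying such metrics can be sketched directly: iterate d(s, t) = max_a [ |R(s, a) - R(t, a)| + γ · W(d)(P(s, a), P(t, a)) ], where W(d) is the Kantorovich distance between successor distributions. The sketch below uses a hypothetical three-state MDP (not taken from the paper) with deterministic transitions, so the Kantorovich term collapses to the distance between the two successor states and no linear-program solve is needed; all names and numbers are illustrative assumptions.

```python
# Minimal sketch of the bisimulation-metric fixed-point iteration,
# assuming a hypothetical MDP with deterministic transitions.
GAMMA = 0.9
STATES = [0, 1, 2]
ACTIONS = ["a", "b"]

# Hypothetical rewards R[s][a] and deterministic successors P[s][a].
R = {0: {"a": 1.0, "b": 0.0},
     1: {"a": 1.0, "b": 0.0},
     2: {"a": 0.0, "b": 1.0}}
P = {0: {"a": 0, "b": 1},
     1: {"a": 1, "b": 0},
     2: {"a": 2, "b": 2}}

def bisim_metric(iters=200):
    # Start from the zero pseudometric and apply the update map repeatedly;
    # for deterministic transitions the Kantorovich term W(d)(P(s,a), P(t,a))
    # is just d evaluated at the two successor states.
    d = {(s, t): 0.0 for s in STATES for t in STATES}
    for _ in range(iters):
        d = {(s, t): max(abs(R[s][a] - R[t][a])
                         + GAMMA * d[(P[s][a], P[t][a])]
                         for a in ACTIONS)
             for s in STATES for t in STATES}
    return d

d = bisim_metric()
```

In this toy MDP, states 0 and 1 end up at distance 0 (they are bisimilar), while state 2 sits at distance 1/(1 - γ) = 10 from both, since its rewards differ by 1 under every action at every step. The value-function bound mentioned in the abstract then guarantees |V*(s) - V*(t)| is at most d(s, t).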