Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence.
Metrics for finite Markov decision processes. UAI '04: Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence.
Transfer Learning for Reinforcement Learning Domains: A Survey. The Journal of Machine Learning Research.
Automatic construction of temporally extended actions for MDPs using bisimulation metrics. EWRL'11: Proceedings of the 9th European Conference on Recent Advances in Reinforcement Learning.
Much of the work on Markov Decision Processes (MDPs) in artificial intelligence (AI) focuses on solving a single problem. However, AI agents often persist over long periods of time, during which they may be required to solve several related tasks. This scenario has motivated a significant amount of recent research on knowledge transfer methods for MDPs. The idea is to allow an agent to re-use, over its lifetime, the expertise accumulated while solving past tasks (see Taylor & Stone, 2009, for a comprehensive survey).
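The kind of re-use described above can be illustrated with a toy sketch (not from the paper itself): tabular Q-learning is run on one small MDP, and the resulting Q-table is used to warm-start learning on a related task. All names, the chain MDP, and the hyperparameters here are illustrative assumptions, not the authors' method.

```python
import random

def q_learning(rewards, n_states, n_actions, transitions, episodes=200,
               alpha=0.5, gamma=0.9, epsilon=0.1, q_init=None):
    """Tabular Q-learning on a small deterministic MDP.

    transitions[s][a] gives the next state; rewards[s][a] the reward.
    q_init optionally warm-starts from a previously learned Q-table
    (the transfer step); by default the table is all zeros.
    """
    q = [row[:] for row in q_init] if q_init else [[0.0] * n_actions
                                                   for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        for _ in range(20):  # bounded episode length
            if random.random() < epsilon:
                a = random.randrange(n_actions)          # explore
            else:
                a = max(range(n_actions), key=lambda i: q[s][i])  # exploit
            s2 = transitions[s][a]
            q[s][a] += alpha * (rewards[s][a] + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q

# A 4-state chain: action 1 moves right, action 0 stays put.
transitions = [[0, 1], [1, 2], [2, 3], [3, 3]]
rewards_a = [[0, 0], [0, 0], [0, 1.0], [0, 0]]  # task A: reward for reaching state 3
rewards_b = [[0, 0], [0, 0], [0, 0.9], [0, 0]]  # task B: related task, smaller reward

random.seed(0)
q_a = q_learning(rewards_a, 4, 2, transitions)   # solve task A from scratch
q_b = q_learning(rewards_b, 4, 2, transitions,
                 episodes=20, q_init=q_a)        # re-use q_a on task B
```

Because the two tasks share dynamics and differ only in reward scale, the transferred Q-table already ranks actions correctly, so far fewer episodes suffice on the second task.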