Decision tree methods for finding reusable MDP homomorphisms

Authors:
Alicia Peregrin Wolfe;Andrew G. Barto
Affiliations:
Department of Computer Science, University of Massachusetts, Amherst, Amherst, MA;Department of Computer Science, University of Massachusetts, Amherst, Amherst, MA
Venue:
AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 1
Year:
2006

Citing 14

Learning to Perceive and Act by Trial and Error
    Machine Learning

Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning
    Artificial Intelligence

Bounded-parameter Markov decision process
    Artificial Intelligence

Discovering Hierarchy in Reinforcement Learning with HEXQ
    ICML '02 Proceedings of the Nineteenth International Conference on Machine Learning

Equivalence notions and model minimization in Markov decision processes
    Artificial Intelligence - special issue on planning with uncertainty and incomplete information

Reinforcement learning with selective perception and hidden state

An algebraic approach to abstraction in reinforcement learning

Tree-Based Batch Mode Reinforcement Learning
    The Journal of Machine Learning Research

Interactive learning of mappings from visual percepts to actions
    ICML '05 Proceedings of the 22nd international conference on Machine learning

A causal approach to hierarchical decomposition of factored MDPs
    ICML '05 Proceedings of the 22nd international conference on Machine learning

SMDP homomorphisms: an algebraic approach to abstraction in semi-Markov decision processes
    IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence

Symbolic dynamic programming for first-order MDPs
    IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 1

SPUDD: stochastic planning using decision diagrams
    UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence

Feature-Discovering approximate value iteration methods
    SARA'05 Proceedings of the 6th international conference on Abstraction, Reformulation and Approximation

Cited 2

Transfer via soft homomorphisms
    Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 2

Learning to make predictions in partially observable environments without a generative model
    Journal of Artificial Intelligence Research

Abstract

State abstraction is a useful tool for agents interacting with complex environments. Good state abstractions are compact, reusable, and easy to learn from sample data. This paper combines and extends two existing classes of state abstraction methods to meet these criteria. The first class of methods searches for MDP homomorphisms (Ravindran 2004), which produce models of reward and transition probabilities in an abstract state space. The second class of methods, like the UTree algorithm (McCallum 1995), learns compact models of the value function quickly from sample data. Models based on MDP homomorphisms can easily be extended so that they are usable across tasks with similar reward functions. However, value-based methods like UTree cannot be extended in this fashion. We present results showing that a new, combined algorithm fulfills all three criteria: the resulting models are compact, can be learned quickly from sample data, and can be used across a class of reward functions.
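
To make the homomorphism idea in the abstract concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a small tabular MDP, a state mapping f, and an identity mapping on actions (a simplification; homomorphisms in general may also remap actions). It checks that states mapped to the same abstract state agree on reward and on the total transition probability into each abstract state. All names (P, R, f_good, f_bad, is_homomorphism, block_transition) are hypothetical and chosen for illustration.

```python
from collections import defaultdict

def block_transition(P, s, a, f):
    """Sum P[s][a] over next states, grouped by their abstract state under f."""
    out = defaultdict(float)
    for s_next, p in P[s][a].items():
        out[f[s_next]] += p
    return dict(out)

def is_homomorphism(P, R, f, actions, tol=1e-9):
    """Return True if f preserves rewards and block transition probabilities.

    Assumption: actions are mapped to themselves (identity action mapping).
    """
    # Group ground states by the abstract state they map to.
    groups = defaultdict(list)
    for s, b in f.items():
        groups[b].append(s)

    for members in groups.values():
        ref = members[0]
        for s in members[1:]:
            for a in actions:
                # Rewards must match within a block.
                if abs(R[s][a] - R[ref][a]) > tol:
                    return False
                # Probability of landing in each abstract state must match.
                t_ref = block_transition(P, ref, a, f)
                t_s = block_transition(P, s, a, f)
                for b in set(t_ref) | set(t_s):
                    if abs(t_ref.get(b, 0.0) - t_s.get(b, 0.0)) > tol:
                        return False
    return True

# Toy MDP (illustrative data only): states 0 and 1 behave symmetrically,
# state 2 is an absorbing goal with reward 1.
ACTIONS = ["go"]
P = {
    0: {"go": {0: 0.5, 2: 0.5}},
    1: {"go": {1: 0.5, 2: 0.5}},
    2: {"go": {2: 1.0}},
}
R = {0: {"go": 0.0}, 1: {"go": 0.0}, 2: {"go": 1.0}}

f_good = {0: "A", 1: "A", 2: "G"}  # merges the two symmetric states
f_bad = {0: "A", 1: "A", 2: "A"}   # also merges the goal, which breaks rewards

result_good = is_homomorphism(P, R, f_good, ACTIONS)
result_bad = is_homomorphism(P, R, f_bad, ACTIONS)
```

In this toy case the mapping f_good passes the check because states 0 and 1 have equal rewards and equal aggregated transitions, while f_bad fails because it merges states with different rewards. A model built over the abstract states of f_good holds reward and transition information, which is what allows it to be reused when the reward function changes, in contrast to a model of the value function alone.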