Learning to Perceive and Act by Trial and Error. Machine Learning.
Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning. Artificial Intelligence.
Bounded-parameter Markov decision processes. Artificial Intelligence.
Discovering Hierarchy in Reinforcement Learning with HEXQ. Proceedings of the Nineteenth International Conference on Machine Learning (ICML '02).
Equivalence notions and model minimization in Markov decision processes. Artificial Intelligence (special issue on planning with uncertainty and incomplete information).
Reinforcement learning with selective perception and hidden state. Ph.D. thesis, University of Rochester.
An algebraic approach to abstraction in reinforcement learning. Ph.D. thesis, University of Massachusetts Amherst.
Tree-Based Batch Mode Reinforcement Learning. Journal of Machine Learning Research.
Interactive learning of mappings from visual percepts to actions. Proceedings of the 22nd International Conference on Machine Learning (ICML '05).
A causal approach to hierarchical decomposition of factored MDPs. Proceedings of the 22nd International Conference on Machine Learning (ICML '05).
SMDP homomorphisms: an algebraic approach to abstraction in semi-Markov decision processes. Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI '03).
Symbolic dynamic programming for first-order MDPs. Proceedings of the 17th International Joint Conference on Artificial Intelligence (IJCAI '01), Volume 1.
SPUDD: stochastic planning using decision diagrams. Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI '99).
Feature-discovering approximate value iteration methods. Proceedings of the 6th International Conference on Abstraction, Reformulation and Approximation (SARA '05).
Transfer via soft homomorphisms. Proceedings of the 8th International Conference on Autonomous Agents and Multiagent Systems, Volume 2.
Learning to make predictions in partially observable environments without a generative model. Journal of Artificial Intelligence Research.
State abstraction is a useful tool for agents interacting with complex environments. Good state abstractions are compact, reusable, and easy to learn from sample data. This paper combines and extends two existing classes of state abstraction methods to meet all three criteria. The first class searches for MDP homomorphisms (Ravindran 2004), which yield models of reward and transition probabilities over an abstract state space. The second class, exemplified by the UTree algorithm (McCallum 1995), learns compact models of the value function quickly from sample data. Models based on MDP homomorphisms extend naturally to tasks with similar reward functions; value-based methods such as UTree cannot be extended in this fashion. We present results for a new, combined algorithm that satisfies all three criteria: the resulting models are compact, can be learned quickly from sample data, and can be used across a class of reward functions.
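To make the homomorphism condition concrete, here is a minimal sketch in Python/NumPy that checks whether candidate state and action mappings form an MDP homomorphism for a small tabular MDP. It is an illustration of the standard definition, not code from the paper; the function name `is_mdp_homomorphism`, the tensor layout `P[s, a, s2]`, and the state-dependent `action_map` are assumptions chosen for clarity.

```python
import numpy as np

def is_mdp_homomorphism(P, R, state_map, action_map,
                        n_abs_states, n_abs_actions, atol=1e-8):
    """Check whether (state_map, action_map) defines an MDP homomorphism.

    P[s, a, s2] : transition probabilities of the concrete MDP
    R[s, a]     : expected rewards of the concrete MDP
    state_map[s]     -> abstract state h(s)
    action_map[s][a] -> abstract action g_s(a) (state-dependent,
                        following Ravindran 2004)
    """
    n_states, n_actions, _ = P.shape
    R_abs = np.full((n_abs_states, n_abs_actions), np.nan)
    P_abs = np.full((n_abs_states, n_abs_actions, n_abs_states), np.nan)
    for s in range(n_states):
        for a in range(n_actions):
            hs, ha = state_map[s], action_map[s][a]
            # Block transition probabilities: total mass landing in
            # each abstract successor state.
            block = np.zeros(n_abs_states)
            for s2 in range(n_states):
                block[state_map[s2]] += P[s, a, s2]
            if np.isnan(R_abs[hs, ha]):
                # The first concrete pair mapped to (hs, ha) defines
                # the abstract model.
                R_abs[hs, ha] = R[s, a]
                P_abs[hs, ha] = block
            elif (abs(R_abs[hs, ha] - R[s, a]) > atol
                  or not np.allclose(P_abs[hs, ha], block, atol=atol)):
                # Two aggregated pairs disagree on reward or dynamics,
                # so the mapping is not a homomorphism.
                return False
    return True
```

In a symmetric gridworld, for instance, folding mirror-image states onto one abstract state passes this check only when their rewards and block dynamics coincide; that agreement is what lets a homomorphism-based abstract model be reused across tasks sharing the same dynamics.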