Anticipatory Learning Classifier Systems and Factored Reinforcement Learning
Anticipatory Behavior in Adaptive Learning Systems
The Factored Markov Decision Process (FMDP) framework is a standard representation for sequential decision problems under uncertainty in which the state is represented as a collection of random variables. Factored Reinforcement Learning (FRL) is a model-based reinforcement learning approach to FMDPs in which the transition and reward functions of the problem are learned. In this paper, we show how to model, in a theoretically well-founded way, problems where some combinations of state variable values cannot occur, giving rise to impossible states. Furthermore, we propose a new heuristic that treats states not yet observed as impossible. We derive an algorithm whose improvement in performance over the standard approach is illustrated through benchmark experiments.
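The heuristic described above can be illustrated with a minimal sketch (this is an illustrative reconstruction, not the authors' algorithm; the class and method names are hypothetical): observed states are recorded during interaction, and when the planner enumerates the cross-product of factored variable domains, any state not yet seen is pruned as impossible.

```python
from itertools import product

class SeenStateFilter:
    """Hypothetical sketch of the 'unseen states are impossible' heuristic."""

    def __init__(self, variable_domains):
        # variable_domains: dict mapping variable name -> list of its values
        self.variable_domains = variable_domains
        self.seen = set()  # states observed so far, stored as value tuples

    def observe(self, state):
        """Record a state (tuple of variable values) seen in a transition."""
        self.seen.add(tuple(state))

    def is_possible(self, state):
        """Heuristic: a state is considered possible only if already seen."""
        return tuple(state) in self.seen

    def candidate_states(self):
        """Enumerate the full cross-product of variable values,
        keeping only states the heuristic deems possible."""
        names = list(self.variable_domains)
        for values in product(*(self.variable_domains[n] for n in names)):
            if self.is_possible(values):
                yield values
```

For example, with two binary variables the cross-product has four states, but only the observed ones survive the filter, so a planner iterating over `candidate_states()` never spends effort on impossible value combinations.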