Exploiting Additive Structure in Factored MDPs for Reinforcement Learning

Authors:
Thomas Degris;Olivier Sigaud;Pierre-Henri Wuillemin
Affiliations:
Université Pierre et Marie Curie - Paris6, Paris, France F-75005;Université Pierre et Marie Curie - Paris6, Paris, France F-75005;Université Pierre et Marie Curie - Paris6, Paris, France F-75005
Venue:
Recent Advances in Reinforcement Learning
Year:
2008

Abstract

SDYNA is a framework for addressing large, discrete, stochastic reinforcement learning problems. It incrementally learns a factored MDP (FMDP) representing the problem to solve, while using FMDP planning techniques to build an efficient policy. SPITI, an instantiation of SDYNA, uses a planning method based on dynamic programming, which cannot exploit the additive structure of an FMDP. In this paper, we present two new instantiations of SDYNA, namely ULP and UNATLP, which use a planning method based on linear programming that can exploit the additive structure of an FMDP and address problems beyond the reach of SPITI.
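
The abstract does not spell out what "additive structure" means. As a minimal sketch, assuming the usual sense in which a function over the state is a sum of local terms that each depend on only a few state variables, the toy example below compares an additive reward with the flat joint table it replaces. The variable names, reward values, and helper functions are hypothetical and are not taken from the paper; this is not the ULP or UNATLP planner.

```python
from itertools import product

# Hypothetical toy problem: three binary state variables.
VARIABLES = ("x0", "x1", "x2")

# Each local term depends on a single variable (its scope).
# The numbers are made up for illustration only.
LOCAL_REWARDS = {
    "x0": {0: 0.0, 1: 1.0},
    "x1": {0: 0.0, 1: 2.0},
    "x2": {0: 0.0, 1: 4.0},
}


def additive_reward(state):
    """Sum the local reward terms for a state given as {variable: value}."""
    return sum(table[state[var]] for var, table in LOCAL_REWARDS.items())


def flat_reward_table():
    """Enumerate the joint table a non-additive representation would store."""
    table = {}
    for values in product((0, 1), repeat=len(VARIABLES)):
        state = dict(zip(VARIABLES, values))
        table[values] = additive_reward(state)
    return table


if __name__ == "__main__":
    flat = flat_reward_table()
    additive_size = sum(len(t) for t in LOCAL_REWARDS.values())
    # The flat table has one entry per joint state; the additive form
    # has one entry per (variable, value) pair.
    print("flat entries:", len(flat))
    print("additive entries:", additive_size)
```

With n binary variables the flat table grows as 2^n entries while this additive form grows as 2n, which is the kind of saving a planner that exploits additive structure can benefit from.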