Exploiting Additive Structure in Factored MDPs for Reinforcement Learning

  • Authors:
  • Thomas Degris; Olivier Sigaud; Pierre-Henri Wuillemin

  • Affiliations:
  • Université Pierre et Marie Curie - Paris6, Paris, France F-75005; Université Pierre et Marie Curie - Paris6, Paris, France F-75005; Université Pierre et Marie Curie - Paris6, Paris, France F-75005

  • Venue:
  • Recent Advances in Reinforcement Learning
  • Year:
  • 2008

Abstract

SDYNA is a framework for addressing large, discrete, stochastic reinforcement learning problems. It incrementally learns a factored MDP (FMDP) representing the problem to solve while using FMDP planning techniques to build an efficient policy. SPITI, an instantiation of SDYNA, uses a planning method based on dynamic programming, which cannot exploit the additive structure of an FMDP. In this paper, we present two new instantiations of SDYNA, namely ULP and UNATLP, which use a planning method based on linear programming that can exploit the additive structure of an FMDP and address problems out of reach of SPITI.
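
To make "additive structure" concrete, here is a minimal sketch in Python. It is not taken from the paper and is not the SDYNA, ULP, or UNATLP implementation. It only illustrates the general idea that a function over a factored state can be stored as a sum of local terms, each depending on a few state variables, instead of as one entry per full state. The variable count, the scopes, the numeric values, and all names (`LOCAL_TERMS`, `additive_value`, `flat_table`, `additive_size`) are assumptions chosen for illustration.

```python
from itertools import product

# Assumption: 4 binary state variables (hypothetical toy problem).
N_VARS = 4

# Assumption: the function is a sum of three local terms.
# Each term is (scope, table), where scope lists the variable indices
# the term depends on and table maps scope values to a number.
LOCAL_TERMS = [
    ((0,), {(0,): 0.0, (1,): 1.0}),
    ((1, 2), {(0, 0): 0.0, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 2.0}),
    ((3,), {(0,): 0.0, (1,): 1.5}),
]


def additive_value(state, terms=LOCAL_TERMS):
    """Evaluate a function stored as a sum of local terms on a full state."""
    return sum(table[tuple(state[i] for i in scope)] for scope, table in terms)


def flat_table(n_vars=N_VARS, terms=LOCAL_TERMS):
    """Expand the additive form into one entry per full state (2**n_vars entries)."""
    return {s: additive_value(s, terms) for s in product((0, 1), repeat=n_vars)}


def additive_size(terms=LOCAL_TERMS):
    """Count the parameters actually stored in the additive form."""
    return sum(len(table) for _, table in terms)
```

In this toy case the additive form stores 8 numbers while the flat table needs 16, and the gap grows exponentially with the number of variables when the scopes stay small. A planner that can work directly on the local terms avoids building the flat table at all.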