Incremental plan aggregation for generating policies in MDPs

  • Authors:
  • Florent Teichteil-Königsbuch; Ugur Kuter; Guillaume Infantes

  • Affiliations:
  • ONERA-DCSD, Toulouse Cedex, France; University of Maryland, College Park, MD; ONERA-DCSD, Toulouse Cedex, France

  • Venue:
  • Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2010), Volume 1
  • Year:
  • 2010

Abstract

Despite recent advances in planning with MDPs, the problem of generating good policies is still hard. This paper describes a way to generate policies in MDPs by (1) determinizing the given MDP model into a classical planning problem; (2) building partial policies off-line by producing solution plans to the classical planning problem and incrementally aggregating them into a policy; and (3) using sequential Monte-Carlo (MC) simulations of the partial policies before execution, in order to assess the probability of replanning for a policy during execution. The objective of this approach is to quickly generate policies whose probability of replanning is low and below a given threshold. We describe our planner RFF, which incorporates the above ideas. We present theorems showing the termination, soundness and completeness properties of RFF. RFF was the winner of the fully-observable probabilistic track of the 2008 International Planning Competition (IPC-08). In addition to our analyses of the IPC-08 results, we analyzed RFF's performance with different plan aggregation and determinization strategies, with varying amounts of MC sampling, and with varying threshold values for the probability of replanning. The results of these experiments reveal how these factors affect the time RFF takes to generate solution policies and the quality of those policies (i.e., the average accumulated reward gathered from executing them).
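
To make the loop described in the abstract concrete, here is a minimal Python sketch of the determinize / plan-aggregate / MC-simulate cycle. All interfaces and parameter names (mdp.sample, mdp.is_goal, determinize, classical_plan, rho, n_sims, horizon) are hypothetical placeholders, not RFF's actual API; the sketch only illustrates the control flow: seed a policy from one plan in the determinized problem, estimate the probability of leaving the policy by simulation, and aggregate new plans from the escape states until that probability drops below the threshold.

    def rff_sketch(mdp, determinize, classical_plan, s0, rho=0.2, n_sims=100, horizon=50):
        """Illustrative sketch of incremental plan aggregation (not RFF's real interface).

        Assumed (hypothetical) interfaces:
          mdp.sample(s, a)        -> next state drawn from the MDP's transition distribution
          mdp.is_goal(s)          -> True if s satisfies the goal
          determinize(mdp)        -> a deterministic (classical) relaxation of the MDP
          classical_plan(det, s)  -> list of (state, action) pairs reaching the goal in det
        States are assumed hashable.
        """
        det = determinize(mdp)
        policy = dict(classical_plan(det, s0))      # seed the partial policy from one plan

        while True:
            # Monte-Carlo simulation of the partial policy: count runs that leave it.
            escapes, frontier = 0, set()
            for _ in range(n_sims):
                s = s0
                for _ in range(horizon):
                    if mdp.is_goal(s):
                        break
                    if s not in policy:             # policy undefined here -> would replan
                        escapes += 1
                        frontier.add(s)
                        break
                    s = mdp.sample(s, policy[s])
            if escapes / n_sims <= rho:             # estimated replanning probability low enough
                return policy
            # Aggregate new plans from the states where the policy was undefined.
            for s in frontier:
                for state, action in classical_plan(det, s):
                    policy.setdefault(state, action)

The threshold rho plays the role of the bound on replanning probability mentioned above: the loop stops as soon as the simulated fraction of runs that reach a state with no assigned action falls below it.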