Incremental plan aggregation for generating policies in MDPs

  • Authors:
  • Florent Teichteil-Königsbuch; Ugur Kuter; Guillaume Infantes

  • Affiliations:
  • ONERA-DCSD, Toulouse Cedex, France; University of Maryland, College Park, MD; ONERA-DCSD, Toulouse Cedex, France

  • Venue:
  • Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2010), Volume 1
  • Year:
  • 2010

Abstract

Despite recent advances in planning with MDPs, the problem of generating good policies is still hard. This paper describes a way to generate policies in MDPs by (1) determinizing the given MDP model into a classical planning problem; (2) building partial policies off-line by producing solution plans to the classical planning problem and incrementally aggregating them into a policy; and (3) using sequential Monte-Carlo (MC) simulations of the partial policies before execution, in order to assess the probability of replanning for a policy during execution. The objective of this approach is to quickly generate policies whose probability of replanning is low and below a given threshold. We describe our planner RFF, which incorporates the above ideas. We present theorems showing the termination, soundness and completeness properties of RFF. RFF was the winner of the fully-observable probabilistic track of the 2008 International Planning Competition (IPC-08). In addition to our analyses of the IPC-08 results, we analyzed RFF's performance with different plan aggregation and determinization strategies, with varying amounts of MC sampling, and with varying threshold values for the probability of replanning. The results of these experiments reveal how these factors affect the time RFF takes to generate solution policies and the quality of those policies (i.e., the average accumulated reward gathered from executing them).
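
To make the loop described in the abstract concrete, here is a minimal Python sketch of the determinize / plan-aggregate / MC-simulate cycle. All interfaces and parameter names (mdp.sample, mdp.is_goal, determinize, classical_plan, rho, n_sims, horizon) are hypothetical placeholders, not RFF's actual API; the sketch only illustrates the control flow: seed a policy from one plan in the determinized problem, estimate the probability of leaving the policy by simulation, and aggregate new plans from the escape states until that probability drops below the threshold.

    def rff_sketch(mdp, determinize, classical_plan, s0, rho=0.2, n_sims=100, horizon=50):
        """Illustrative sketch of incremental plan aggregation (not RFF's real interface).

        Assumed (hypothetical) interfaces:
          mdp.sample(s, a)        -> next state drawn from the MDP's transition distribution
          mdp.is_goal(s)          -> True if s satisfies the goal
          determinize(mdp)        -> a deterministic (classical) relaxation of the MDP
          classical_plan(det, s)  -> list of (state, action) pairs reaching the goal in det
        States are assumed hashable.
        """
        det = determinize(mdp)
        policy = dict(classical_plan(det, s0))      # seed the partial policy from one plan

        while True:
            # Monte-Carlo simulation of the partial policy: count runs that leave it.
            escapes, frontier = 0, set()
            for _ in range(n_sims):
                s = s0
                for _ in range(horizon):
                    if mdp.is_goal(s):
                        break
                    if s not in policy:             # policy undefined here -> would replan
                        escapes += 1
                        frontier.add(s)
                        break
                    s = mdp.sample(s, policy[s])
            if escapes / n_sims <= rho:             # estimated replanning probability low enough
                return policy
            # Aggregate new plans from the states where the policy was undefined.
            for s in frontier:
                for state, action in classical_plan(det, s):
                    policy.setdefault(state, action)

The threshold rho plays the role of the bound on replanning probability mentioned above: the loop stops as soon as the simulated fraction of runs that reach a state with no assigned action falls below it.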