Escaping local optima in POMDP planning as inference

  • Authors:
  • Pascal Poupart; Tobias Lang; Marc Toussaint

  • Affiliations:
  • University of Waterloo, Ontario, Canada; Machine Learning and Robotics Lab, FU Berlin, Germany; Machine Learning and Robotics Lab, FU Berlin, Germany

  • Venue:
  • The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 3
  • Year:
  • 2011

Abstract

Planning as inference recently emerged as a versatile approach to decision-theoretic planning and reinforcement learning for single- and multi-agent systems in fully and partially observable domains with discrete and continuous variables. Since planning as inference essentially tackles a non-convex optimization problem when the states are partially observable, there is a need for techniques that can robustly escape local optima. We propose two algorithms: the first adds nodes to the finite-state controller according to an increasingly deep forward search, while the second splits nodes in a greedy fashion to improve the reward likelihood.
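The policies both algorithms operate on are finite-state controllers: graphs whose nodes carry actions and whose edges route observations to successor nodes. The sketch below is not the paper's algorithms — it only illustrates the setting on an assumed tiger-style toy POMDP (with a milder penalty than the classic benchmark, chosen so one observation is informative enough to act on). It evaluates a controller by fixed-point iteration and shows how a one-node "always listen" controller is a poor solution that a hand-picked split of the listen node into listen/open nodes escapes. All names, parameters, and the split itself are illustrative assumptions.

```python
# Toy FSC evaluation on an assumed tiger-style POMDP (illustration only;
# the model parameters and the node split are not taken from the paper).

GAMMA = 0.95
STATES = (0, 1)   # tiger behind the left / right door
OBS = (0, 1)      # hear the tiger on the left / right

def transition(s, a):
    """P(s' | s, a): listening leaves the tiger; opening resets it uniformly."""
    return {s: 1.0} if a == "listen" else {0: 0.5, 1: 0.5}

def observation(o, s_next, a):
    """P(o | s', a): listening is 85% accurate; opening is uninformative."""
    if a == "listen":
        return 0.85 if o == s_next else 0.15
    return 0.5

def reward(s, a):
    if a == "listen":
        return -1.0
    door = 0 if a == "open-left" else 1
    return -25.0 if door == s else 10.0   # assumed mild penalty for the tiger

def evaluate(nodes, iters=500):
    """Fixed-point iteration on V(n, s) for a controller given as a list of
    (action, {observation: successor-node}) pairs."""
    V = {(n, s): 0.0 for n in range(len(nodes)) for s in STATES}
    for _ in range(iters):
        V = {
            (n, s): reward(s, a) + GAMMA * sum(
                p * observation(o, s2, a) * V[(succ[o], s2)]
                for s2, p in transition(s, a).items()
                for o in OBS
            )
            for n, (a, succ) in enumerate(nodes)
            for s in STATES
        }
    return V

def value(nodes, start=0):
    """Controller value at the uniform initial belief."""
    V = evaluate(nodes)
    return 0.5 * (V[(start, 0)] + V[(start, 1)])

# A one-node controller that always listens: a weak solution (value -1 per step).
listen_only = [("listen", {0: 0, 1: 0})]

# Splitting the listen node into listen / open-right / open-left, routing each
# observation to the action that opens the door away from the heard tiger.
split = [
    ("listen",     {0: 1, 1: 2}),
    ("open-right", {0: 0, 1: 0}),
    ("open-left",  {0: 0, 1: 0}),
]

print(value(listen_only))   # -> about -20.0
print(value(split))         # -> about 36.0
```

Note that a greedy local search that changes one node at a time can stall at the listen-only controller on harsher reward settings, which is exactly the kind of local optimum the paper's deeper forward search and node-splitting schemes are designed to escape.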