Improving anytime point-based value iteration using principled point selections

Authors:
Michael R. James;Michael E. Samples;Dmitri A. Dolgov
Affiliations:
AI and Robotics Group, Technical Research, Toyota Technical Center;AI and Robotics Group, Technical Research, Toyota Technical Center;AI and Robotics Group, Technical Research, Toyota Technical Center
Venue:
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Year:
2007

Citing (5):

Predictive state representations: a new theory for modeling dynamical systems
UAI '04 Proceedings of the 20th conference on Uncertainty in artificial intelligence

Perseus: randomized point-based value iteration for POMDPs
Journal of Artificial Intelligence Research

Point-based value iteration: an anytime algorithm for POMDPs
IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence

Using core beliefs for point-based value iteration
IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence

Model-based online learning of POMDPs
ECML'05 Proceedings of the 16th European conference on Machine Learning

Cited by (1):

Generalized point based value iteration for interactive POMDPs
AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 1

Abstract

Planning in partially observable dynamical systems (such as POMDPs and PSRs) is a computationally challenging task. Popular approximation techniques that have proven successful are point-based planning methods, including point-based value iteration (PBVI), which works by approximating the solution at a finite set of points. These point-based methods are typically anytime algorithms, whereby an initial solution is obtained using a small set of points, and the solution may be incrementally improved by including additional points. We introduce a family of anytime PBVI algorithms that use the information present in the current solution to identify and add new points that have the potential to best improve the next solution. We motivate and present two different methods for choosing points and evaluate their performance empirically, demonstrating that high-quality solutions can be obtained with significantly fewer points than previous PBVI approaches.
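To make "approximating the solution at a finite set of points" concrete, the following is a minimal sketch of the standard point-based backup used by PBVI-style methods for a discrete POMDP. It is not the point-selection procedure introduced in this paper, which the abstract does not specify. The names `T`, `Z`, `R`, `gamma`, `point_based_backup`, and `pbvi_sweep` are assumptions made for illustration: `T[a][s][s2]` is the transition probability, `Z[a][s2][o]` the observation probability, and `R[a][s]` the immediate reward.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


def point_based_backup(b, alphas, T, Z, R, gamma):
    """Return the best one-step-backed-up alpha vector at belief point b.

    Assumed (hypothetical) model layout:
      T[a][s][s2] : probability of moving from s to s2 under action a
      Z[a][s2][o] : probability of observing o after reaching s2 under a
      R[a][s]     : immediate reward for taking a in s
    """
    n_states = len(b)
    n_obs = len(Z[0][0])
    best_vec, best_val = None, float("-inf")
    for a in range(len(R)):
        g_a = list(R[a])  # start from the immediate reward
        for o in range(n_obs):
            # Project every current alpha vector back through (a, o).
            candidates = [
                [
                    sum(T[a][s][s2] * Z[a][s2][o] * alpha[s2]
                        for s2 in range(n_states))
                    for s in range(n_states)
                ]
                for alpha in alphas
            ]
            # Keep only the projection that is best at this belief point.
            g_ao = max(candidates, key=lambda g: dot(b, g))
            for s in range(n_states):
                g_a[s] += gamma * g_ao[s]
        val = dot(b, g_a)
        if val > best_val:
            best_vec, best_val = g_a, val
    return best_vec


def pbvi_sweep(beliefs, alphas, T, Z, R, gamma):
    """One value-iteration sweep: one backed-up vector per belief point."""
    return [point_based_backup(b, alphas, T, Z, R, gamma) for b in beliefs]
```

In an anytime scheme of the kind the abstract describes, `pbvi_sweep` would be repeated on the current belief set, after which new belief points are added and the sweeps resume. How those new points are chosen is the paper's contribution and is left out of this sketch.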