PPCP: efficient probabilistic planning with clear preferences in partially-known environments

Authors:
Maxim Likhachev;Anthony Stentz
Affiliations:
The Robotics Institute, Carnegie Mellon University, Pittsburgh, PA;The Robotics, Carnegie Mellon University, Pittsburgh, PA
Venue:
AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 1
Year:
2006

Citing 6
Cited 2

The complexity of Markov decision processes

Mathematics of Operations Research
Learning to act using real-time dynamic programming

Artificial Intelligence - Special volume on computational research on interaction and agency, part 1
LAO: a heuristic search algorithm that finds solutions with loops

Artificial Intelligence - Special issue on heuristic search in artificial intelligence
Problem-Solving Methods in Artificial Intelligence

Problem-Solving Methods in Artificial Intelligence
Point-based value iteration: an anytime algorithm for POMDPs

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Faster heuristic search algorithms for planning with uncertainty and full feedback

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence

Canadian traveler problem with remote sensing

IJCAI'09 Proceedings of the 21st international jont conference on Artifical intelligence
Deterministic POMDPs revisited

UAI '09 Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence

Quantified Score

Hi-index	0.00

Visualization

Abstract

For most real-world problems the agent operates in only partially-known environments. Probabilistic planners can reason over the missing information and produce plans that take into account the uncertainty about the environment. Unfortunately though, they can rarely scale up to the problems that are of interest in real-world. In this paper, however, we show that for a certain subset of problems we can develop a very efficient probabilistic planner. The proposed planner, called PPCP, is applicable to the problems for which it is clear what values of the missing information would result in the best plan. In other words, there exists a clear preference for the actual values of the missing information. For example, in the problem of robot navigation in partially-known environments it is always preferred to find out that an initially unknown location is traversable rather than not. The planner we propose exploits this property by using a series of deterministic A*-like searches to construct and refine a policy in anytime fashion. On the theoretical side, we show that once converged, the policy is guaranteed to be optimal under certain conditions. On the experimental side, we show the power of PPCP on the problem of robot navigation in partially-known terrains. The planner can scale up to very large environments with thousands of initially unknown locations. We believe that this is several orders of magnitude more unknowns than what the current probabilistic planners developed for the same problem can handle. Also, despite the fact that the problem we experimented on in general does not satisfy the conditions for the solution optimality, PPCP still produces the solutions that are nearly always optimal.