The complexity of Markov decision processes
Mathematics of Operations Research
Learning to act using real-time dynamic programming
Artificial Intelligence - Special volume on computational research on interaction and agency, part 1
LAO: a heuristic search algorithm that finds solutions with loops
Artificial Intelligence - Special issue on heuristic search in artificial intelligence
Problem-Solving Methods in Artificial Intelligence
Problem-Solving Methods in Artificial Intelligence
Point-based value iteration: an anytime algorithm for POMDPs
IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Faster heuristic search algorithms for planning with uncertainty and full feedback
IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Canadian traveler problem with remote sensing
IJCAI'09 Proceedings of the 21st international jont conference on Artifical intelligence
Deterministic POMDPs revisited
UAI '09 Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence
Hi-index | 0.00 |
For most real-world problems the agent operates in only partially-known environments. Probabilistic planners can reason over the missing information and produce plans that take into account the uncertainty about the environment. Unfortunately though, they can rarely scale up to the problems that are of interest in real-world. In this paper, however, we show that for a certain subset of problems we can develop a very efficient probabilistic planner. The proposed planner, called PPCP, is applicable to the problems for which it is clear what values of the missing information would result in the best plan. In other words, there exists a clear preference for the actual values of the missing information. For example, in the problem of robot navigation in partially-known environments it is always preferred to find out that an initially unknown location is traversable rather than not. The planner we propose exploits this property by using a series of deterministic A*-like searches to construct and refine a policy in anytime fashion. On the theoretical side, we show that once converged, the policy is guaranteed to be optimal under certain conditions. On the experimental side, we show the power of PPCP on the problem of robot navigation in partially-known terrains. The planner can scale up to very large environments with thousands of initially unknown locations. We believe that this is several orders of magnitude more unknowns than what the current probabilistic planners developed for the same problem can handle. Also, despite the fact that the problem we experimented on in general does not satisfy the conditions for the solution optimality, PPCP still produces the solutions that are nearly always optimal.