Finding solutions to high-dimensional Markov decision processes (MDPs) is difficult, especially in the presence of uncertainty or when actions and time are continuous. This difficulty can often be alleviated by problem-specific knowledge: for example, it may be relatively easy to design controllers that perform well locally, though with no global guarantees. We propose a nonparametric method that combines such local controllers to obtain globally good solutions. We apply this formulation to two types of problems: motion planning (stochastic shortest-path problems) and discounted-cost MDPs. For motion planning, we argue that considering only the expected cost of a path may be overly simplistic in the presence of uncertainty. Instead, we propose finding the minimum-cost path subject to the constraint that the robot reaches the goal with high probability, and we prove that a polynomial number of samples suffices to obtain such a path. For discounted MDPs, we consider several problem formulations that explicitly account for model uncertainty. We provide empirical evidence of the usefulness of these approaches on the control of a robot arm.
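To make the constrained path objective concrete, the following is a minimal, hypothetical sketch: among candidate paths, select the minimum-cost one whose probability of reaching the goal meets a threshold. The toy model (independent per-edge success probabilities) and all function names are illustrative assumptions, not the paper's actual sampling-based algorithm.

```python
def success_probability(path):
    """Probability the whole path succeeds, assuming independent edge outcomes."""
    p = 1.0
    for _, edge_success in path:
        p *= edge_success
    return p

def min_cost_reliable_path(paths, min_success=0.9):
    """Cheapest path whose success probability is at least min_success.

    Each path is a list of (edge_cost, edge_success_probability) pairs.
    Returns None if no path satisfies the reliability constraint.
    """
    feasible = [p for p in paths if success_probability(p) >= min_success]
    if not feasible:
        return None
    return min(feasible, key=lambda p: sum(cost for cost, _ in p))

paths = [
    [(1.0, 0.80), (1.0, 0.80)],  # cheap but risky: success prob 0.64
    [(2.0, 0.97), (2.0, 0.97)],  # costlier, reliable: success prob ~0.94
    [(5.0, 0.99), (5.0, 0.99)],  # most reliable but expensive
]
best = min_cost_reliable_path(paths, min_success=0.9)
```

Note how the risk constraint changes the answer: the cheapest path is rejected because its success probability (0.64) falls below the 0.9 threshold, so the selection falls to the cheapest of the remaining reliable paths.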