Stochastic planning with concurrent, durative actions

  • Authors: Daniel S. Weld; Mausam
  • Affiliations: University of Washington; University of Washington
  • Venue: Doctoral dissertation, University of Washington
  • Year: 2007

Abstract

Controlling intelligent agents in complex, uncertain, real-world environments poses requirements that classical planning formulations do not address. Such problems typically require the ability to reason about uncertain action effects and are frequently formulated as Markov Decision Processes (MDPs). MDPs, while otherwise an expressive model, have two limitations: (1) MDP algorithms do not scale well, since their running time grows polynomially with the size of the state space, and the state space is typically exponential in the number of domain features, making state-of-the-art MDP algorithms impractical for large problems; (2) MDPs allow only sequential, non-durative actions, a severe restriction when modeling and solving real-world planning problems, where these assumptions rarely hold.

In this thesis we present solutions to both problems. For the former, we present a probabilistic planner based on a novel approach to hybridizing two algorithms. In particular, we hybridize GPT, an exact MDP solver, with MBP, a planner that plans using a qualitative (non-deterministic) model of uncertainty. Whereas exact MDP solvers produce optimal solutions, qualitative planners are fast and highly scalable. Our hybridized planner, HYBPLAN, obtains the best of both techniques: speed, quality, and scalability. Moreover, HYBPLAN has excellent anytime properties and makes effective use of available time and memory.

For the latter, we extend the MDP model to incorporate (1) simultaneous action execution, (2) durative actions, and (3) stochastic durations. We develop several algorithms to combat the computational explosion introduced by these features. The key theoretical ideas behind these algorithms are modeling a complex problem as an MDP in an extended state/action space, pruning irrelevant actions, sampling relevant actions, using informed heuristics to guide the search, hybridizing different planners to obtain the benefits of both, and approximating the problem and replanning. Our empirical evaluation illuminates the respective merits of the various algorithms, viz., optimality, empirical closeness to optimality, theoretical error bounds, and speed.
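To make the scaling limitation concrete, below is a minimal tabular value-iteration sketch in Python. It is illustrative only, not code from the thesis; `transition` and `reward` are assumed callables. Each sweep costs time polynomial in the number of states, but a domain with n boolean features induces 2**n flat states, which is exactly why exact solvers become impractical on large factored problems.

```python
import itertools

def value_iteration(states, actions, transition, reward, gamma=0.95, eps=1e-4):
    """Tabular value iteration; transition(s, a) yields (next_state, prob) pairs."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(
                reward(s, a) + gamma * sum(p * V[t] for t, p in transition(s, a))
                for a in actions
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            return V

# With n boolean domain features, the flat state space has 2**n states:
n = 10
states = list(itertools.product((0, 1), repeat=n))  # 1,024 states; n = 40 gives ~10**12
```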
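The hybridization idea can likewise be sketched in a few lines. Below is a minimal anytime loop, assuming hypothetical stand-ins `solve_exact_partial` (an incremental exact solver in the role of GPT) and `qualitative_policy` (a fast, scalable planner in the role of MBP); these are not the real GPT or MBP interfaces.

```python
import time

def hybrid_policy(mdp, deadline, solve_exact_partial, qualitative_policy):
    """Anytime hybrid sketch: refine an exact partial policy while time remains,
    falling back to the fast qualitative policy on states not yet solved exactly."""
    fallback = qualitative_policy(mdp)  # fast, scalable, possibly suboptimal
    exact = {}                          # state -> action, grows as time permits
    while time.monotonic() < deadline:
        exact.update(solve_exact_partial(mdp, budget=0.1))  # refine a little more
    return lambda s: exact.get(s, fallback(s))  # exact where solved, else fallback
```

The returned policy uses the exact action wherever the exact solver has converged and the qualitative action everywhere else, which is one way to read the abstract's claim of combining speed, quality, and scalability with anytime behavior.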
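Finally, the "extended state/action space" used to model concurrency and durations can be pictured as pairing the world state with the set of in-flight actions. The sketch below is an assumed representation for illustration only (the thesis's exact encoding may differ), with a hypothetical `conflicts` mutex test.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Tuple

@dataclass(frozen=True)
class ExtendedState:
    features: FrozenSet[str]                # propositions true in the world
    executing: Tuple[Tuple[str, int], ...]  # (action, remaining duration) pairs

def startable_combinations(state: ExtendedState, actions: list,
                           conflicts: Callable[[str, str], bool]):
    """Enumerate sets of actions that may start together: no new action may
    conflict with another new one or with anything already executing.
    Pruning such combinations is one way to tame the exponential action space."""
    running = {a for a, _ in state.executing}
    combos = [frozenset()]
    for a in actions:
        if any(conflicts(a, b) for b in running):
            continue
        combos += [c | {a} for c in combos
                   if not any(conflicts(a, b) for b in c)]
    return combos
```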