Agents must often construct plans that obey limits on continuous resources whose consumption can be characterized only by probability distributions. Markov decision processes (MDPs) with a mixed state space of continuous and discrete variables are a popular model for such domains, but current algorithms for these MDPs scale poorly as the state space grows. To remedy this we propose an algorithm called DPFP. DPFP's key contribution is its exploitation of the dual space of cumulative distribution functions. This dual formulation underlies DPFP's novel combination of three features. First, it places DPFP in a class of algorithms that perform forward search in a large (possibly infinite) policy space. Second, it provides a new and efficient way to vary the policy generation effort according to the likelihood of reaching different regions of the MDP state space. Third, it yields a bound on the error introduced by such approximations. Together, these three features account for DPFP's superior performance and its systematic trade-off of optimality for speed. Our experimental evaluation shows that, when run stand-alone, DPFP outperforms other algorithms in anytime performance, whereas when run as a hybrid it yields a significant speedup of a leading continuous resource MDP solver.
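To make the second and third features concrete, the following is a minimal sketch (not the DPFP algorithm itself, and not its dual CDF formulation) of the underlying idea: forward search over a toy continuous-resource MDP in which an action's resource consumption is stochastic, with branches pruned whenever their reachability probability falls below a threshold. The action model, rewards, and numbers are invented for illustration; pruned branches contribute at most their reachability probability times the maximum remaining reward, which is what bounds the approximation error.

```python
def forward_search(resource, reach_prob, depth, threshold=0.0):
    """Toy forward search for a continuous-resource MDP (illustrative only).

    A single action earns reward 1 and consumes resource according to a
    two-point distribution: cost 1 with prob 0.7, cost 3 with prob 0.3.
    Branches whose reachability probability drops below `threshold` are
    pruned, trading optimality for speed with a bounded error.
    """
    if resource <= 0 or depth == 0:
        return 0.0
    if reach_prob < threshold:
        # Pruned region: true value lost here is at most reach_prob * depth
        # (reachability mass times maximum remaining reward).
        return 0.0
    value = 0.0
    for cost, p in [(1, 0.7), (3, 0.3)]:
        if resource - cost >= 0:
            value += p * (1.0 + forward_search(resource - cost,
                                               reach_prob * p,
                                               depth - 1,
                                               threshold))
    return value

# Exact value vs. approximate value with unlikely regions pruned.
exact = forward_search(resource=5, reach_prob=1.0, depth=5, threshold=0.0)
approx = forward_search(resource=5, reach_prob=1.0, depth=5, threshold=0.25)
print(exact, approx)
```

Raising the threshold skips ever-larger low-probability regions of the state space, so the returned value can only decrease, while the total pruned probability mass gives an explicit bound on how far it can fall below the exact value.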