Agents must often construct plans that obey limits on continuous resources whose consumption can be characterized only by probability distributions. Markov decision processes (MDPs) with a mixed state space of continuous and discrete variables are a popular model for such domains, but current algorithms for these MDPs scale poorly as the state space grows. To remedy this we propose an algorithm called DPFP. DPFP's key contribution is its exploitation of the dual space of cumulative distribution functions. This dual formulation underlies DPFP's novel combination of three features. First, it places DPFP in a class of algorithms that perform forward search in a large (possibly infinite) policy space. Second, it provides a new and efficient way to vary the policy generation effort according to the likelihood of reaching different regions of the MDP state space. Third, it yields a bound on the error introduced by such approximations. Together, these three features account for DPFP's superior performance and its systematic trade-off of optimality for speed. Our experimental evaluation shows that, when run stand-alone, DPFP outperforms other algorithms in anytime performance, whereas when run as a hybrid it yields a significant speedup of a leading continuous resource MDP solver.
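To make the second and third features concrete, the following is a minimal sketch (not the DPFP algorithm itself, and not its dual CDF formulation) of the underlying idea: forward search over a toy continuous-resource MDP in which an action's resource consumption is stochastic, with branches pruned whenever their reachability probability falls below a threshold. The action model, rewards, and numbers are invented for illustration; pruned branches contribute at most their reachability probability times the maximum remaining reward, which is what bounds the approximation error.

```python
def forward_search(resource, reach_prob, depth, threshold=0.0):
    """Toy forward search for a continuous-resource MDP (illustrative only).

    A single action earns reward 1 and consumes resource according to a
    two-point distribution: cost 1 with prob 0.7, cost 3 with prob 0.3.
    Branches whose reachability probability drops below `threshold` are
    pruned, trading optimality for speed with a bounded error.
    """
    if resource <= 0 or depth == 0:
        return 0.0
    if reach_prob < threshold:
        # Pruned region: true value lost here is at most reach_prob * depth
        # (reachability mass times maximum remaining reward).
        return 0.0
    value = 0.0
    for cost, p in [(1, 0.7), (3, 0.3)]:
        if resource - cost >= 0:
            value += p * (1.0 + forward_search(resource - cost,
                                               reach_prob * p,
                                               depth - 1,
                                               threshold))
    return value

# Exact value vs. approximate value with unlikely regions pruned.
exact = forward_search(resource=5, reach_prob=1.0, depth=5, threshold=0.0)
approx = forward_search(resource=5, reach_prob=1.0, depth=5, threshold=0.25)
print(exact, approx)
```

Raising the threshold skips ever-larger low-probability regions of the state space, so the returned value can only decrease, while the total pruned probability mass gives an explicit bound on how far it can fall below the exact value.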