A fast analytical algorithm for solving Markov decision processes with real-valued resources

Authors:
Janusz Marecki;Sven Koenig;Milind Tambe
Affiliations:
Computer Science Department, University of Southern California, Los Angeles, CA;Computer Science Department, University of Southern California, Los Angeles, CA;Computer Science Department, University of Southern California, Los Angeles, CA
Venue:
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Year:
2007

References (5):

Least-squares policy iteration
The Journal of Machine Learning Research

Solving generalized semi-Markov decision processes using continuous phase-type distributions
AAAI'04 Proceedings of the 19th national conference on Artificial intelligence

Risk-sensitive planning with one-switch utility functions: value iteration
AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 2

Lazy approximation for solving continuous finite-horizon MDPs
AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 3

Planning under continuous time and resource uncertainty: a challenge for AI
UAI'02 Proceedings of the Eighteenth conference on Uncertainty in artificial intelligence

Cited by (8):

RIAACT: a robust approach to adjustable autonomy for human-multiagent teams
Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 3

Solving Decentralized Continuous Markov Decision Problems with Structured Reward
KI '07 Proceedings of the 30th annual German conference on Advances in Artificial Intelligence

Towards faster planning with continuous resources in stochastic domains
AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 2

A heuristic search approach to planning with continuous resources in stochastic domains
Journal of Artificial Intelligence Research

Function allocation for NextGen airspace via agents
Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: Industry track

Planning in stochastic domains for multiple agents with individual continuous resource state-spaces
Autonomous Agents and Multi-Agent Systems

Continuous time planning for multiagent teams with temporal constraints
IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume One

Robust optimization for hybrid MDPs with state-dependent noise
IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence

Abstract

Agents often have to construct plans that obey deadlines or, more generally, resource limits for real-valued resources whose consumption can only be characterized by probability distributions, such as execution time or battery power. These planning problems can be modeled with continuous-state Markov decision processes (MDPs), but existing solution methods are either inefficient or provide no guarantee on the quality of the resulting policy. We therefore present CPH, a novel solution method that first approximates, with any desired accuracy, the probability distributions over the resource consumptions with phase-type distributions, which use exponential distributions as building blocks. It then uses value iteration to solve the resulting MDPs, exploiting properties of exponential distributions to calculate the necessary convolutions accurately and efficiently while providing strong guarantees on the quality of the resulting policy. Our experimental feasibility study in a Mars rover domain demonstrates a substantial speedup over Lazy Approximation, which is currently the leading algorithm for solving continuous-state MDPs with quality guarantees.
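As a rough illustration of why exponential building blocks make convolutions tractable, the sketch below uses the Erlang distribution, one of the simplest phase-type distributions. It is not the CPH algorithm itself. The function names `erlang_cdf` and `erlang_fit` are hypothetical, and using identical phases with a common rate is an assumption made only to keep the example short. The k-fold convolution of an exponential distribution has a closed-form cumulative distribution function, so the probability of finishing k phases before a deadline needs no numerical integration.

```python
import math


def erlang_cdf(k: int, rate: float, t: float) -> float:
    """P(sum of k i.i.d. Exp(rate) durations <= t), in closed form.

    Uses F(t) = 1 - exp(-rate*t) * sum_{n=0}^{k-1} (rate*t)^n / n!.
    """
    if t <= 0:
        return 0.0
    x = rate * t
    term = 1.0   # holds x^n / n!, starting at n = 0
    total = 0.0
    for n in range(k):
        total += term
        term *= x / (n + 1)
    return 1.0 - math.exp(-x) * total


def erlang_fit(mean: float, k: int) -> tuple[float, float]:
    """Return (rate, variance) of the Erlang-k distribution with the given mean.

    Illustrative assumption: all k phases share one rate, so rate = k / mean
    and the variance is k / rate^2 = mean^2 / k.
    """
    rate = k / mean
    variance = k / rate ** 2
    return rate, variance


if __name__ == "__main__":
    # Chance of meeting a deadline of 5 time units for an action whose
    # duration has mean 4, modeled with 1, 4, and 16 exponential phases.
    for k in (1, 4, 16):
        rate, var = erlang_fit(4.0, k)
        print(k, round(var, 3), round(erlang_cdf(k, rate, 5.0), 4))
```

With the mean held fixed, the variance falls as 1/k, so sharply peaked duration distributions need more phases. This is one way to see the trade-off between approximation accuracy and model size.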