Partitioned external-memory value iteration

Authors:
Peng Dai; Mausam; Daniel S. Weld
Affiliations:
Dept. of Computer Science and Engineering, University of Washington, Seattle, WA; Dept. of Computer Science and Engineering, University of Washington, Seattle, WA; Dept. of Computer Science and Engineering, University of Washington, Seattle, WA
Venue:
AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 2
Year:
2008

Abstract

Dynamic programming methods (including value iteration, LAO*, RTDP, and derivatives) are popular algorithms for solving Markov decision processes (MDPs). Unfortunately, these techniques store the MDP model extensionally in a table and thus are limited by the amount of main memory available. Since the required space is exponential in the number of domain features, these dynamic programming methods are ineffective for large problems. To address this problem, Edelkamp et al. devised the external-memory value iteration (EMVI) algorithm, which uses a clever sorting scheme to efficiently move parts of the model between disk and main memory. While EMVI can handle larger problems than previously addressed, the need to repeatedly perform external sorts still limits scalability. This paper proposes a new approach. We partition an MDP into smaller pieces (blocks), keeping just the relevant blocks in memory and performing Bellman backups block by block. Experiments show that our algorithm solves large MDPs an order of magnitude faster than EMVI.
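
The block-by-block backup idea can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not specify how blocks are chosen, ordered, or moved between disk and memory, so everything here is an assumption made for illustration. The names `bellman_backup` and `partitioned_value_iteration`, the four-state toy MDP, and the hand-picked partition are all hypothetical, and the whole model is held in memory, with a comment marking where an external-memory solver would load a block.

```python
from typing import Dict, List, Tuple

# Hypothetical toy model, not taken from the paper:
# transitions[state][action] = (cost, [(probability, next_state), ...])
Outcomes = List[Tuple[float, int]]
Transitions = Dict[int, Dict[str, Tuple[float, Outcomes]]]


def bellman_backup(state: int, transitions: Transitions,
                   values: Dict[int, float]) -> float:
    """Minimum over actions of immediate cost plus expected cost-to-go."""
    return min(
        cost + sum(prob * values[nxt] for prob, nxt in outcomes)
        for cost, outcomes in transitions[state].values()
    )


def partitioned_value_iteration(transitions: Transitions, goal: int,
                                blocks: List[List[int]],
                                epsilon: float = 1e-6,
                                max_sweeps: int = 10_000) -> Dict[int, float]:
    """Sweep over the blocks, backing up every state of one block at a time."""
    values = {state: 0.0 for block in blocks for state in block}
    values[goal] = 0.0
    for _ in range(max_sweeps):
        residual = 0.0
        for block in blocks:
            # Assumption: an external-memory solver would load this block
            # from disk here and write it back after the inner loop.
            for state in block:
                if state == goal:
                    continue  # the goal keeps a cost-to-go of zero
                new_value = bellman_backup(state, transitions, values)
                residual = max(residual, abs(new_value - values[state]))
                values[state] = new_value
        if residual < epsilon:
            break  # no value changed by epsilon or more in a full sweep
    return values


# Chain 0 -> 1 -> 2 -> 3 (goal); the move out of state 1 fails half the time.
transitions: Transitions = {
    0: {"move": (1.0, [(1.0, 1)])},
    1: {"move": (1.0, [(0.5, 2), (0.5, 1)])},
    2: {"move": (1.0, [(1.0, 3)])},
}
blocks = [[2, 3], [0, 1]]  # hand-picked partition, goal-side block first
values = partitioned_value_iteration(transitions, goal=3, blocks=blocks)
print(values)  # costs-to-go approach 4, 3, 1, 0 for states 0, 1, 2, 3
```

In this sketch the block nearest the goal is processed first so that improved values propagate backward within a single sweep. That ordering is a choice made for the toy example, not a claim about the ordering used in the paper.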