New prioritized value iteration for Markov decision processes

Authors:
Ma. De Garcia-Hernandez;Jose Ruiz-Pinales;Eva Onaindia;J. Gabriel Aviña-Cervantes;Sergio Ledesma-Orozco;Edgar Alvarado-Mendez;Alberto Reyes-Ballesteros
Affiliations:
University of Guanajuato, Guanajuato, Mexico;University of Guanajuato, Guanajuato, Mexico;Universitat Politècnica de València, Valencia, Spain 46022;University of Guanajuato, Guanajuato, Mexico;University of Guanajuato, Guanajuato, Mexico;University of Guanajuato, Guanajuato, Mexico;Electrical Research Institute, Morelos, Mexico 62490
Venue:
Artificial Intelligence Review
Year:
2012

Abstract

Solving large Markov decision processes accurately and quickly is challenging. Because the computational effort is considerable, current research focuses on finding better acceleration techniques. For instance, the convergence properties of current solution methods depend, to a great extent, on the order of backup operations. On the one hand, algorithms such as topological sorting can find good orderings, but their overhead is usually high. On the other hand, shortest-path methods such as Dijkstra's algorithm, which is based on priority queues, have been applied successfully to the solution of deterministic shortest-path Markov decision processes. Here, we propose an improved value iteration algorithm based on Dijkstra's algorithm for solving shortest-path Markov decision processes. The experimental results on a stochastic shortest-path problem show the feasibility of our approach.
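
To illustrate how a priority queue can order backup operations in value iteration, the following is a minimal sketch and not the algorithm proposed in the paper, whose details the abstract does not give. It is a generic prioritized-sweeping-style scheme for a small stochastic shortest-path problem: the state with the largest Bellman residual is backed up first, and its predecessors are then re-queued. The function names, the `actions` data layout, the three-state example, and the tolerance `epsilon` are all assumptions made for illustration.

```python
import heapq


def bellman_backup(s, V, actions):
    """Minimum expected cost-to-go over the actions available in state s."""
    return min(cost + sum(p * V[t] for p, t in outcomes)
               for cost, outcomes in actions[s])


def prioritized_value_iteration(actions, goal, epsilon=1e-8, max_backups=1_000_000):
    """Hypothetical helper: value iteration with heap-ordered backups.

    actions[s] is a list of (cost, [(prob, next_state), ...]) pairs.
    Costs are assumed positive and the goal is assumed reachable.
    """
    # Collect every state that appears as a source, a successor, or the goal.
    states = {goal} | set(actions)
    for acts in actions.values():
        for _cost, outcomes in acts:
            states.update(t for _p, t in outcomes)

    # Predecessor sets: which states can reach t in one step.
    preds = {s: set() for s in states}
    for s, acts in actions.items():
        for _cost, outcomes in acts:
            for p, t in outcomes:
                if p > 0.0:
                    preds[t].add(s)

    V = {s: 0.0 for s in states}  # zero is a lower bound when costs are positive
    heap = []  # entries are (-residual, state); heapq is a min-heap

    def push_if_needed(s):
        if s == goal or s not in actions:
            return
        residual = abs(bellman_backup(s, V, actions) - V[s])
        if residual > epsilon:
            heapq.heappush(heap, (-residual, s))

    for s in actions:
        push_if_needed(s)

    backups = 0
    while heap and backups < max_backups:
        _neg_residual, s = heapq.heappop(heap)
        new_value = bellman_backup(s, V, actions)
        backups += 1
        if abs(new_value - V[s]) <= epsilon:
            continue  # stale queue entry: the state was already updated
        V[s] = new_value
        for q in preds[s]:
            push_if_needed(q)  # the change in V[s] may affect its predecessors
    return V


# Assumed toy problem: moving costs 1 and succeeds with probability 0.8;
# from s0 there is also a shortcut to the goal that succeeds half the time.
example_actions = {
    "s0": [(1.0, [(0.8, "s1"), (0.2, "s0")]),
           (1.0, [(0.5, "goal"), (0.5, "s0")])],
    "s1": [(1.0, [(0.8, "goal"), (0.2, "s1")])],
}
example_values = prioritized_value_iteration(example_actions, "goal")
```

In this toy problem the fixed points can be checked by hand: from s1 the expected cost is 1 / 0.8 = 1.25, and from s0 the shortcut gives 1 / 0.5 = 2.0, which is lower than the 2.5 obtained by going through s1. Ordering by residual is only one possible priority; a Dijkstra-like scheme would order states by their current value estimate instead.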