The performance of value and policy iteration can be dramatically improved by eliminating redundant or useless backups, and by backing up states in the right order. We study several methods designed to accelerate these iterative solvers, including prioritization, partitioning, and variable reordering. We generate a family of algorithms by combining several of the methods discussed, and present extensive empirical evidence demonstrating that performance can improve by several orders of magnitude for many problems, while preserving accuracy and convergence guarantees.
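To make the idea of prioritization concrete, the following is a minimal sketch of one common variant, not the paper's own algorithm: states are backed up in order of their Bellman residual, and only the predecessors of a changed state are re-queued. The MDP encoding (`P[s][a]` as a list of `(probability, next_state, reward)` tuples), the function names, and the toy chain problem are all assumptions made for illustration.

```python
import heapq


def bellman_backup(P, V, s, gamma):
    """Best one-step lookahead value of state s (0.0 if s has no actions)."""
    if not P[s]:
        return 0.0
    return max(
        sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
        for outcomes in P[s].values()
    )


def prioritized_value_iteration(P, gamma=0.9, eps=1e-8):
    """Back up states in order of Bellman residual; return (V, backup count)."""
    V = {s: 0.0 for s in P}

    # Predecessor sets: a change in V[s2] can only affect states leading to s2.
    preds = {s: set() for s in P}
    for s, actions in P.items():
        for outcomes in actions.values():
            for p, s2, _ in outcomes:
                if p > 0:
                    preds[s2].add(s)

    # Max-heap via negated priorities; seed with every state that is not yet
    # consistent with its own backup.
    heap = []
    for s in P:
        residual = abs(bellman_backup(P, V, s, gamma) - V[s])
        if residual > eps:
            heapq.heappush(heap, (-residual, s))

    backups = 0
    while heap:
        _, s = heapq.heappop(heap)
        new_value = bellman_backup(P, V, s, gamma)
        if abs(new_value - V[s]) <= eps:
            continue  # stale queue entry: the state is already consistent
        V[s] = new_value
        backups += 1
        # Only predecessors of s can have a changed residual.
        for q in preds[s]:
            residual = abs(bellman_backup(P, V, q, gamma) - V[q])
            if residual > eps:
                heapq.heappush(heap, (-residual, q))
    return V, backups


def chain_mdp(n):
    """Hypothetical toy problem: states 0..n-1 in a line, reward 1 on reaching n-1."""
    P = {s: {"right": [(1.0, s + 1, 1.0 if s + 1 == n - 1 else 0.0)]}
         for s in range(n - 1)}
    P[n - 1] = {}  # terminal state, no actions
    return P
```

On the five-state chain from `chain_mdp(5)`, the value information flows backward from the goal, so this sketch performs one backup per non-terminal state. A plain sweep that visits states in index order would need several full passes to propagate the same information, which is the kind of redundant work that ordering the backups is meant to avoid.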