P3VI: a partitioned, prioritized, parallel value iterator

Authors: David Wingate; Kevin D. Seppi
Affiliations: Brigham Young University, Provo, UT; Brigham Young University, Provo, UT
Venue: ICML '04 Proceedings of the twenty-first international conference on Machine learning
Year: 2004

Abstract

We examine the state of the art in using value iteration to solve large-scale discrete Markov Decision Processes. We introduce an architecture that combines three independent performance enhancements (intelligent prioritization of computation, state partitioning, and massively parallel processing) into a single algorithm. We show that each idea improves performance in a different way, so algorithm designers do not have to trade one improvement for another. We give special attention to parallelization issues: how to partition states efficiently, distribute partitions to processors, minimize message passing, and ensure high scalability. We present experimental results demonstrating that this approach solves large problems in reasonable time.
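
To make the combination of partitioning and prioritization concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the function names, the dictionary-based MDP encoding, and the choice of the maximum Bellman residual as a partition's priority are all hypothetical. The sketch also runs sequentially; distributing partitions to processors and passing messages between them is omitted.

```python
def bellman_backup(s, V, P, R, gamma):
    """Best expected one-step return from state s under the current values V."""
    return max(
        R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a].items())
        for a in P[s]
    )


def partitioned_prioritized_vi(P, R, partition_of, gamma=0.9, eps=1e-8):
    """Sequential sketch: always sweep the partition with the largest residual.

    P[s][a] maps successor states to probabilities, R[s][a] is the reward,
    and partition_of[s] names the partition of state s (all hypothetical).
    """
    V = {s: 0.0 for s in P}

    # Group states by partition.
    parts = {}
    for s in P:
        parts.setdefault(partition_of[s], []).append(s)

    # dependents[p]: partitions containing a state that can transition into p,
    # i.e. the partitions whose priority may change when values in p change.
    dependents = {p: set() for p in parts}
    for s in P:
        for a in P[s]:
            for s2 in P[s][a]:
                dependents[partition_of[s2]].add(partition_of[s])

    def residual(p):
        # Assumed priority metric: largest Bellman residual in the partition.
        return max(abs(bellman_backup(s, V, P, R, gamma) - V[s]) for s in parts[p])

    priority = {p: residual(p) for p in parts}
    while True:
        p = max(priority, key=priority.get)
        if priority[p] < eps:
            return V
        # Sweep the chosen partition in place until it is locally converged.
        while True:
            delta = 0.0
            for s in parts[p]:
                new_value = bellman_backup(s, V, P, R, gamma)
                delta = max(delta, abs(new_value - V[s]))
                V[s] = new_value
            if delta < eps:
                break
        # Only this partition and those depending on it need new priorities.
        for q in dependents[p] | {p}:
            priority[q] = residual(q)


# Toy four-state chain: moving "right" from state 2 into state 3 pays 1.
P = {
    0: {"stay": {0: 1.0}, "right": {1: 1.0}},
    1: {"stay": {1: 1.0}, "right": {2: 1.0}},
    2: {"stay": {2: 1.0}, "right": {3: 1.0}},
    3: {"stay": {3: 1.0}},
}
R = {
    0: {"stay": 0.0, "right": 0.0},
    1: {"stay": 0.0, "right": 0.0},
    2: {"stay": 0.0, "right": 1.0},
    3: {"stay": 0.0},
}
partition_of = {0: "A", 1: "A", 2: "B", 3: "B"}
V = partitioned_prioritized_vi(P, R, partition_of)
```

On this toy chain, partition B (where the reward is) is swept first, and partition A is swept only once B's new values have raised its priority. In a parallel setting, the `dependents` map would roughly correspond to the partitions that need to be notified when a partition's values change.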