Standard value-function approaches to finding policies for Partially Observable Markov Decision Processes (POMDPs) are generally considered intractable for large models. The intractability of these algorithms is to a large extent a consequence of computing an exact, optimal policy over the entire belief space. However, in real-world POMDP problems, computing the optimal policy over the full belief space is often unnecessary for good control, even for problems with complicated policy classes. The beliefs experienced by the controller often lie near a structured, low-dimensional subspace embedded in the high-dimensional belief space, and finding a good approximation to the optimal value function over only this subspace can be much easier than computing the full value function. We introduce a new method for solving large-scale POMDPs by reducing the dimensionality of the belief space. We use Exponential family Principal Components Analysis (E-PCA; Collins, Dasgupta, & Schapire, 2002) to represent sparse, high-dimensional belief spaces using small sets of learned features of the belief state, and then plan only in terms of the low-dimensional belief features. By planning in this low-dimensional space, we can find policies for POMDP models that are orders of magnitude larger than models that can be handled by conventional techniques. We demonstrate the use of this algorithm on a synthetic problem and on mobile robot navigation tasks.