Point-based value iteration: an anytime algorithm for POMDPs

Authors:
Joelle Pineau; Geoff Gordon; Sebastian Thrun
Affiliations:
Carnegie Mellon University, Robotics Institute, Pittsburgh, PA; Carnegie Mellon University, Robotics Institute, Pittsburgh, PA; Carnegie Mellon University, Robotics Institute, Pittsburgh, PA
Venue:
IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Year:
2003

Abstract

This paper introduces the Point-Based Value Iteration (PBVI) algorithm for POMDP planning. PBVI approximates an exact value iteration solution by selecting a small set of representative belief points and then tracking the value and its derivative for those points only. By using stochastic trajectories to choose belief points, and by maintaining only one value hyper-plane per point, PBVI successfully solves large problems: we present results on a robotic laser tag problem as well as three test domains from the literature.
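
The point-based backup summarized above can be illustrated with a minimal sketch, assuming a small discrete POMDP stored as nested lists. The function names (`point_backup`, `pbvi_sweep`, `value`), the Tiger-style toy model, the fixed belief set, and the lower-bound initialization are illustrative assumptions, not taken from the paper. The stochastic-trajectory selection of belief points is not shown; a hand-picked belief set stands in for it.

```python
# Hypothetical sketch: all names and the toy model are illustrative assumptions.

def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(u, v))


def point_backup(b, alphas, T, Z, R, gamma):
    """Return the single best backed-up hyperplane (alpha vector) at belief b.

    T[a][s][s2]: transition probability, Z[a][s2][o]: observation
    probability, R[a][s]: immediate reward, gamma: discount factor.
    """
    n = len(b)
    best_vec, best_val = None, float("-inf")
    for a in range(len(T)):
        vec = list(R[a])
        for o in range(len(Z[a][0])):
            # Project every current hyperplane back through (a, o).
            projected = [
                [sum(T[a][s][s2] * Z[a][s2][o] * alpha[s2] for s2 in range(n))
                 for s in range(n)]
                for alpha in alphas
            ]
            # Keep only the projection that is best at this belief.
            g = max(projected, key=lambda p: dot(b, p))
            vec = [v + gamma * x for v, x in zip(vec, g)]
        val = dot(b, vec)
        if val > best_val:
            best_vec, best_val = vec, val
    return best_vec


def pbvi_sweep(beliefs, alphas, T, Z, R, gamma):
    """One value-update sweep: exactly one hyperplane per belief point."""
    return [point_backup(b, alphas, T, Z, R, gamma) for b in beliefs]


def value(b, alphas):
    """Value at belief b: the upper surface of the hyperplanes."""
    return max(dot(b, alpha) for alpha in alphas)


# Toy two-state model (assumed). Actions: 0 = listen, 1 = open-left,
# 2 = open-right. States: 0 = tiger-left, 1 = tiger-right.
T = [
    [[1.0, 0.0], [0.0, 1.0]],   # listening leaves the state unchanged
    [[0.5, 0.5], [0.5, 0.5]],   # opening a door resets the problem
    [[0.5, 0.5], [0.5, 0.5]],
]
Z = [
    [[0.85, 0.15], [0.15, 0.85]],  # listening is informative
    [[0.5, 0.5], [0.5, 0.5]],      # opening a door is not
    [[0.5, 0.5], [0.5, 0.5]],
]
R = [[-1.0, -1.0], [-100.0, 10.0], [10.0, -100.0]]
gamma = 0.95

# Fixed belief set (assumed) in place of sampled trajectories.
beliefs = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]

# Lower-bound initialization (assumed): worst reward received forever.
r_min = min(min(row) for row in R)
alphas_0 = [[r_min / (1.0 - gamma)] * 2]

alphas_1 = pbvi_sweep(beliefs, alphas_0, T, Z, R, gamma)

# Repeated sweeps over the same fixed belief set.
alphas_k = alphas_0
for _ in range(50):
    alphas_k = pbvi_sweep(beliefs, alphas_k, T, Z, R, gamma)
```

Because each sweep returns one vector per belief point, the size of the value-function representation stays bounded by the size of the belief set, rather than growing with every backup as it does in exact value iteration.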