Online planning algorithms for POMDPs

Authors:
Stéphane Ross; Joelle Pineau; Sébastien Paquet; Brahim Chaib-draa
Affiliations:
School of Computer Science, McGill University, Montreal, Canada; School of Computer Science, McGill University, Montreal, Canada; Department of Computer Science and Software Engineering, Laval University, Quebec, Canada; Department of Computer Science and Software Engineering, Laval University, Quebec, Canada
Venue:
Journal of Artificial Intelligence Research
Year:
2008

Abstract

Partially Observable Markov Decision Processes (POMDPs) provide a rich framework for sequential decision-making under uncertainty in stochastic domains. However, solving a POMDP is often intractable except for small problems because of its computational complexity. Here, we focus on online approaches that alleviate this complexity by computing good local policies at each decision step during execution. Online algorithms generally consist of a lookahead search to find the best action to execute at each time step in an environment. Our objectives here are to survey the various existing online POMDP methods, analyze their properties, and discuss their advantages and disadvantages; and to thoroughly evaluate these online approaches in different environments under various metrics (return, error bound reduction, lower bound improvement). Our experimental results indicate that state-of-the-art online heuristic search methods can handle large POMDP domains efficiently.
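
To make the idea of a lookahead search concrete, the following is a minimal sketch of a depth-limited exhaustive lookahead from the current belief state. It is not one of the algorithms surveyed in the paper. The two-state model, its probabilities and rewards, the discount factor, and all function names are assumptions chosen for illustration. The leaf value of 0 is also a placeholder assumption; a practical method would use a better estimate at the fringe of the search tree.

```python
from typing import List, Optional, Tuple

# Hypothetical two-state toy model (illustrative values, not from the paper).
STATES = [0, 1]
ACTIONS = ["listen", "open0", "open1"]
OBSERVATIONS = [0, 1]
GAMMA = 0.95  # assumed discount factor


def transition(s: int, a: str, s2: int) -> float:
    """P(s2 | s, a): listening keeps the state, opening resets it uniformly."""
    if a == "listen":
        return 1.0 if s == s2 else 0.0
    return 0.5


def observation(a: str, s2: int, o: int) -> float:
    """P(o | a, s2): listening is 85% accurate, other actions are uninformative."""
    if a == "listen":
        return 0.85 if o == s2 else 0.15
    return 0.5


def reward(s: int, a: str) -> float:
    """Immediate reward: small cost to listen, large penalty for the wrong door."""
    if a == "listen":
        return -1.0
    opened = 0 if a == "open0" else 1
    return 10.0 if opened == s else -100.0


def belief_update(b: List[float], a: str, o: int) -> Tuple[List[float], float]:
    """Bayes filter: return the posterior belief and P(o | b, a)."""
    unnorm = [
        observation(a, s2, o) * sum(transition(s, a, s2) * b[s] for s in STATES)
        for s2 in STATES
    ]
    p_o = sum(unnorm)
    if p_o == 0.0:
        return b, 0.0
    return [u / p_o for u in unnorm], p_o


def lookahead(b: List[float], depth: int) -> Tuple[float, Optional[str]]:
    """Expand every action and observation to `depth`; return (value, best action)."""
    if depth == 0:
        return 0.0, None  # placeholder leaf value (assumption)
    best_value, best_action = float("-inf"), None
    for a in ACTIONS:
        # Expected immediate reward under the current belief.
        q = sum(b[s] * reward(s, a) for s in STATES)
        # Discounted expected value of the successor beliefs.
        for o in OBSERVATIONS:
            b2, p_o = belief_update(b, a, o)
            if p_o > 0.0:
                q += GAMMA * p_o * lookahead(b2, depth - 1)[0]
        if q > best_value:
            best_value, best_action = q, a
    return best_value, best_action


if __name__ == "__main__":
    value, action = lookahead([0.5, 0.5], depth=3)
    print(action, value)
```

In an online setting, an agent would call `lookahead` on its current belief, execute the returned action, receive an observation, and apply `belief_update` before planning again. The cost of this exhaustive expansion grows exponentially with the depth, which is why heuristic search over the belief tree is attractive.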