Value-function approximations for partially observable Markov decision processes

Authors: Milos Hauskrecht
Affiliations: Computer Science Department, Brown University, Providence, RI
Venue: Journal of Artificial Intelligence Research
Year: 2000

Abstract

Partially observable Markov decision processes (POMDPs) provide an elegant mathematical framework for modeling complex decision and planning problems in stochastic domains in which states of the system are observable only indirectly, via a set of imperfect or noisy observations. The modeling advantage of POMDPs, however, comes at a price: exact methods for solving them are computationally very expensive and thus applicable in practice only to very simple problems. We focus on efficient approximation (heuristic) methods that attempt to alleviate the computational problem and trade off accuracy for speed. We have two objectives here. First, we survey various approximation methods, analyze their properties and relations, and provide some new insights into their differences. Second, we present a number of new approximation methods and novel refinements of existing techniques. The theoretical results are supported by experiments on a problem from the agent navigation domain.
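
To make the setting in the abstract concrete, the following is a minimal sketch, not taken from the paper, of two ideas it alludes to: maintaining a belief (a probability distribution over hidden states) from noisy observations, and a cheap heuristic that trades accuracy for speed. The heuristic shown is the one commonly called QMDP in the POMDP literature; it is used here only as an illustration and is not claimed to be the method the paper proposes. The two-state model and all names (`T`, `O`, `R`, `GAMMA`, `belief_update`, `mdp_q_values`, `qmdp_action`) are assumptions introduced for this example.

```python
# Hypothetical toy POMDP (an assumption for illustration only):
# 2 hidden states, 2 actions (0 = stay, 1 = move), 2 observations.
# T[a][s][s2]: transition probability, O[a][s2][o]: observation probability,
# R[a][s]: immediate reward for taking action a in state s.
T = [
    [[1.0, 0.0], [0.0, 1.0]],  # stay: state is unchanged
    [[0.0, 1.0], [1.0, 0.0]],  # move: state is swapped
]
O = [
    [[0.8, 0.2], [0.2, 0.8]],  # the sensor reports the true state w.p. 0.8
    [[0.8, 0.2], [0.2, 0.8]],
]
R = [
    [0.0, 1.0],  # staying in state 1 pays 1
    [0.0, 0.0],  # moving never pays immediately
]
GAMMA = 0.9


def belief_update(belief, action, observation):
    """Bayes filter: b'(s2) is proportional to O[a][s2][o] * sum_s T[a][s][s2] * b(s)."""
    n = len(belief)
    unnormalized = []
    for s2 in range(n):
        predicted = sum(T[action][s][s2] * belief[s] for s in range(n))
        unnormalized.append(O[action][s2][observation] * predicted)
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [p / total for p in unnormalized]


def mdp_q_values(sweeps=200):
    """Value iteration on the underlying fully observable MDP; returns Q[a][s]."""
    n_actions, n_states = len(T), len(T[0])
    V = [0.0] * n_states
    Q = [[0.0] * n_states for _ in range(n_actions)]
    for _ in range(sweeps):
        Q = [
            [
                R[a][s] + GAMMA * sum(T[a][s][s2] * V[s2] for s2 in range(n_states))
                for s in range(n_states)
            ]
            for a in range(n_actions)
        ]
        V = [max(Q[a][s] for a in range(n_actions)) for s in range(n_states)]
    return Q


def qmdp_action(belief, Q):
    """QMDP-style heuristic: pick the action maximizing sum_s b(s) * Q[a][s]."""
    scores = [sum(b * q for b, q in zip(belief, Q[a])) for a in range(len(Q))]
    return max(range(len(Q)), key=scores.__getitem__)


if __name__ == "__main__":
    Q = mdp_q_values()
    b = belief_update([0.5, 0.5], action=0, observation=1)
    print("belief after observing 1:", b)
    print("chosen action:", qmdp_action(b, Q))
```

The heuristic solves only the fully observable MDP and weights its Q-values by the current belief, so it is far cheaper than exact POMDP solution methods, but it implicitly assumes that all uncertainty disappears after one step and therefore never chooses an action purely to gather information.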