Computing optimal policies for partially observable decision processes using compact representations

  • Authors:
  • Craig Boutilier; David Poole

  • Affiliations:
  • Department of Computer Science, University of British Columbia, Vancouver, BC, Canada; Department of Computer Science, University of British Columbia, Vancouver, BC, Canada

  • Venue:
  • AAAI'96: Proceedings of the Thirteenth National Conference on Artificial Intelligence - Volume 2
  • Year:
  • 1996


Abstract

Partially observable Markov decision processes (POMDPs) provide a general model for decision-theoretic planning problems, allowing trade-offs between alternative courses of action to be determined under conditions of uncertainty, and incorporating the partial observations made by an agent. Dynamic programming algorithms based on an agent's belief state can be used to construct optimal policies without explicit consideration of past history, but at high computational cost. In this paper, we discuss how structured representations of system dynamics can be incorporated into classic POMDP solution algorithms. We use Bayesian networks with structured conditional probability matrices to represent POMDPs, and use this model to structure the belief space for POMDP algorithms, allowing irrelevant distinctions to be ignored. Apart from speeding up optimal policy construction, we suggest that such representations can be exploited in the development of useful approximation methods.
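
As a concrete illustration of the belief-state updates on which these dynamic programming algorithms operate, the following is a minimal sketch in Python. It assumes a flat, tabular POMDP rather than the paper's structured Bayesian-network representation; the tiny two-state model and all names are hypothetical examples, not the authors' code.

```python
import numpy as np

# Hypothetical flat POMDP: 2 states, 2 actions, 2 observations.
# T[a][s][s'] = P(s' | s, a), O[a][s'][o] = P(o | s', a).
T = np.array([[[0.9, 0.1],    # action 0
               [0.2, 0.8]],
              [[0.5, 0.5],    # action 1 (resets the state uniformly)
               [0.5, 0.5]]])
O = np.array([[[0.85, 0.15],  # action 0: fairly informative observations
               [0.15, 0.85]],
              [[0.5, 0.5],    # action 1: uninformative observations
               [0.5, 0.5]]])

def belief_update(b, a, o):
    """Bayes update: b'(s') ∝ O[a][s'][o] * sum_s T[a][s][s'] * b(s)."""
    predicted = T[a].T @ b           # prediction step: P(s' | b, a)
    unnorm = O[a][:, o] * predicted  # correction step: weight by observation
    return unnorm / unnorm.sum()     # normalize to a probability distribution

b = np.array([0.5, 0.5])             # initial (uniform) belief state
b = belief_update(b, a=0, o=1)
print(b)                              # posterior belief over the two states
```

Exact POMDP solution methods perform dynamic programming over the continuous space of such belief states; the structured representations discussed in the paper aim to avoid enumerating distinctions in this space that are irrelevant to the value function.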