Solving POMDPs by searching the space of finite policies

Authors:
Nicolas Meuleau;Kee-Eung Kim;Leslie Pack Kaelbling;Anthony R. Cassandra
Affiliations:
Computer Science Dept, Brown University, Providence, RI;Computer Science Dept, Brown University, Providence, RI;Computer Science Dept, Brown University, Providence, RI;Computer Science Dept, Brown University, Providence, RI
Venue:
UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Year:
1999

Citing 16
Cited 29

Memoryless policies: theoretical limitations and practical results

SAB94 Proceedings of the third international conference on Simulation of adaptive behavior : from animals to animats 3: from animals to animats 3
Acting optimally in partially observable stochastic domains

AAAI'94 Proceedings of the twelfth national conference on Artificial intelligence (vol. 2)
HQ-learning

Adaptive Behavior
An improved policy iteration algorithm for partially observable MDPs

NIPS '97 Proceedings of the 1997 conference on Advances in neural information processing systems 10
Reinforcement learning with hierarchies of machines

NIPS '97 Proceedings of the 1997 conference on Advances in neural information processing systems 10
Gradient descent for general reinforcement learning

Proceedings of the 1998 conference on Advances in neural information processing systems II
Markov Decision Processes: Discrete Stochastic Dynamic Programming

Markov Decision Processes: Discrete Stochastic Dynamic Programming
Introduction to Reinforcement Learning

Introduction to Reinforcement Learning
Learning Policies with External Memory

ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
Reinforcement learning with selective perception and hidden state

Reinforcement learning with selective perception and hidden state
Exact and approximate algorithms for partially observable markov decision processes

Exact and approximate algorithms for partially observable markov decision processes
Planning and control in stochastic domains with imperfect information

Planning and control in stochastic domains with imperfect information
Finite-memory control of partially observable systems

Finite-memory control of partially observable systems
Planning and acting in partially observable stochastic domains

Artificial Intelligence
Learning finite-state controllers for partially observable environments

UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Solving POMDPs by searching in policy space

UAI'98 Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence

Complexity of finite-horizon Markov decision process problems

Journal of the ACM (JACM)
The Complexity of Decentralized Control of Markov Decision Processes

Mathematics of Operations Research
An online POMDP algorithm for complex multiagent environments

Proceedings of the fourth international joint conference on Autonomous agents and multiagent systems
Exploiting belief bounds: practical POMDPs for personal assistant agents

Proceedings of the fourth international joint conference on Autonomous agents and multiagent systems
Solving POMDPs using quadratically constrained linear programs

AAMAS '06 Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems
Selecting treatment strategies with dynamic limited-memory influence diagrams

Artificial Intelligence in Medicine
A Near Optimal Policy for Channel Allocation in Cognitive Radio

Recent Advances in Reinforcement Learning
Pattern Learning and Decision Making in a Photovoltaic System

SEAL '08 Proceedings of the 7th International Conference on Simulated Evolution and Learning
Stochastic local search for POMDP controllers

AAAI'04 Proceedings of the 19th national conference on Artifical intelligence
Point-based policy iteration

AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 2
Nonapproximability results for partially observable Markov decision processes

Journal of Artificial Intelligence Research
Perseus: randomized point-based value iteration for POMDPs

Journal of Artificial Intelligence Research
Solving POMDPs using quadratically constrained linear programs

IJCAI'07 Proceedings of the 20th international joint conference on Artifical intelligence
Towards efficient computation of error bounded solutions in POMDPs: expected value approximation and dynamic disjunctive beliefs

IJCAI'07 Proceedings of the 20th international joint conference on Artifical intelligence
Complexity of probabilistic planning under average rewards

IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 1
Bounded policy iteration for decentralized POMDPs

IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
Delay-optimal distributed power and transmission threshold control for S-ALOHA network with FSMC fading channels

ISIT'09 Proceedings of the 2009 IEEE international conference on Symposium on Information Theory - Volume 1
Delay-sensitive distributed power and transmission threshold control for S-ALOHA network with finite state Markov fading channels

IEEE Transactions on Wireless Communications
Partially Observable Markov Decision Processes: A Geometric Technique and Analysis

Operations Research
Optimizing fixed-size stochastic controllers for POMDPs and decentralized POMDPs

Autonomous Agents and Multi-Agent Systems
Multiscale Adaptive Agent-Based Management of Storage-Enabled Photovoltaic Facilities

Proceedings of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence
Evolving policies for multi-reward partially observable markov decision processes (MR-POMDPs)

Proceedings of the 13th annual conference on Genetic and evolutionary computation
Analyzing and escaping local optima in planning as inference for partially observable domains

ECML PKDD'11 Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part II
End-to-end transmission control by modeling uncertainty about the network state

Proceedings of the 10th ACM Workshop on Hot Topics in Networks
Learning finite-state controllers for partially observable environments

UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
An optimal best-first search algorithm for solving infinite horizon DEC-POMDPs

ECML'05 Proceedings of the 16th European conference on Machine Learning
On the Computational Complexity of Stochastic Controller Optimization in POMDPs

ACM Transactions on Computation Theory (TOCT)
Recognizing internal states of other agents to anticipate and coordinate interactions

EUMAS'11 Proceedings of the 9th European conference on Multi-Agent Systems
Isomorph-free branch and bound search for finite state controllers

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence

Quantified Score

Hi-index	0.00

Visualization

Abstract

Solving partially observable Markov decision processes (POMDPS) is highly intractable in general, at least in part because the optimal policy may be infinitely large. In this paper, we explore the problem of finding the optimal policy from a restricted set of policies, represented as finite state automata of a given size. This problem is also intractable, but we show that the complexity can be greatly reduced when the POMDP andlor policy are further constrained. We demonstrate good empirical results with a branch-and-bound method for finding globally optimal deterministic policies, and a gradient-ascent method for finding locally optimal stochastic policies.