Heuristic search in restricted memory (research note)
Artificial Intelligence
Acting optimally in partially observable stochastic domains
AAAI'94 Proceedings of the twelfth national conference on Artificial intelligence (vol. 2)
An improved policy iteration algorithm for partially observable MDPs
NIPS '97 Proceedings of the 1997 conference on Advances in neural information processing systems 10
BI-POMDP: Bounded, Incremental, Partially-Observable Markov-Model Planning
ECP '97 Proceedings of the 4th European Conference on Planning: Recent Advances in AI Planning
Incremental Markov-Model Planning
ICTAI '96 Proceedings of the 8th International Conference on Tools with Artificial Intelligence
Planning and Acting in Partially Observable Stochastic Domains
Efficient dynamic-programming updates in partially observable Markov decision processes
Finite-memory control of partially observable systems
A heuristic variable grid solution method for POMDPs
AAAI'97/IAAI'97 Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence
Incremental methods for computing bounds in partially observable Markov decision processes
AAAI'97/IAAI'97 Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence
Incremental pruning: a simple, fast, exact method for partially observable Markov decision processes
UAI'97 Proceedings of the Thirteenth conference on Uncertainty in artificial intelligence
Complexity of finite-horizon Markov decision process problems
Journal of the ACM (JACM)
Planning and Control in Artificial Intelligence: A Unifying Perspective
Applied Intelligence
The Complexity of Decentralized Control of Markov Decision Processes
Mathematics of Operations Research
Reactive Navigation Using Reinforcement Learning in Situations of POMDPs
IWANN '01 Proceedings of the 6th International Work-Conference on Artificial and Natural Neural Networks: Bio-inspired Applications of Connectionism-Part II
A POMDP formulation of preference elicitation problems
Eighteenth national conference on Artificial intelligence
Learning diagnostic policies from examples by systematic search
UAI '04 Proceedings of the 20th conference on Uncertainty in artificial intelligence
An online POMDP algorithm for complex multiagent environments
Proceedings of the fourth international joint conference on Autonomous agents and multiagent systems
Heuristic anytime approaches to stochastic decision processes
Journal of Heuristics
Partially observable Markov decision processes for spoken dialog systems
Computer Speech and Language
Partially observable Markov decision processes with imprecise parameters
Artificial Intelligence
Model-free reinforcement learning as mixture learning
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Stochastic local search for POMDP controllers
AAAI'04 Proceedings of the 19th national conference on Artificial intelligence
Dynamic programming for partially observable stochastic games
AAAI'04 Proceedings of the 19th national conference on Artificial intelligence
Indefinite-horizon POMDPs with action-based termination
AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 2
A variance analysis for POMDP policy evaluation
AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 2
Value-function approximations for partially observable Markov decision processes
Journal of Artificial Intelligence Research
Speeding up the convergence of value iteration in partially observable Markov decision processes
Journal of Artificial Intelligence Research
Nonapproximability results for partially observable Markov decision processes
Journal of Artificial Intelligence Research
Finding approximate POMDP solutions through belief compression
Journal of Artificial Intelligence Research
Perseus: randomized point-based value iteration for POMDPs
Journal of Artificial Intelligence Research
Integrating learning from examples into the search for diagnostic policies
Journal of Artificial Intelligence Research
Online planning algorithms for POMDPs
Journal of Artificial Intelligence Research
Policy iteration for decentralized control of Markov decision processes
Journal of Artificial Intelligence Research
Solving POMDPs using quadratically constrained linear programs
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Complexity of probabilistic planning under average rewards
IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 1
An improved grid-based approximation algorithm for POMDPs
IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 1
Bounded policy iteration for decentralized POMDPs
IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
Conformant plans and beyond: Principles and complexity
Artificial Intelligence
A POMDP approximation algorithm that anticipates the need to observe
PRICAI'00 Proceedings of the 6th Pacific Rim international conference on Artificial intelligence
Optimizing fixed-size stochastic controllers for POMDPs and decentralized POMDPs
Autonomous Agents and Multi-Agent Systems
A Modified Memory-Based Reinforcement Learning Method for Solving POMDP Problems
Neural Processing Letters
HTN-style planning in relational POMDPs using first-order FSCs
KI'11 Proceedings of the 34th Annual German conference on Advances in artificial intelligence
My brain is full: when more memory helps
UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Solving POMDPs by searching the space of finite policies
UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Learning finite-state controllers for partially observable environments
UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
The complexity of decentralized control of Markov decision processes
UAI'00 Proceedings of the Sixteenth conference on Uncertainty in artificial intelligence
A POMDP model for guiding taxi cruising in a congested urban city
MICAI'11 Proceedings of the 10th Mexican international conference on Advances in Artificial Intelligence - Volume Part I
The Skyline algorithm for POMDP value function pruning
Annals of Mathematics and Artificial Intelligence
Generalized and bounded policy iteration for finitely-nested interactive POMDPs: scaling up
Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 2
On the Computational Complexity of Stochastic Controller Optimization in POMDPs
ACM Transactions on Computation Theory (TOCT)
A survey of point-based POMDP solvers
Autonomous Agents and Multi-Agent Systems
Producing efficient error-bounded solutions for transition independent decentralized MDPs
Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems
Most algorithms for solving POMDPs iteratively improve a value function that implicitly represents a policy; they are said to search in value-function space. This paper presents an approach to solving POMDPs that represents a policy explicitly as a finite-state controller and iteratively improves the controller by search in policy space. Two related algorithms illustrate this approach. The first is a policy iteration algorithm that can outperform value iteration in solving infinite-horizon POMDPs. It provides the foundation for a new heuristic search algorithm that promises further speedup by focusing computational effort on regions of the problem space that are reachable, or likely to be reached, from a start state.
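To make the policy representation concrete: a finite-state controller labels each of its nodes with an action and indexes its edges by observation, so executing the policy needs no run-time belief tracking. A minimal sketch (the controller structure and the tiger-style example below are illustrative assumptions, not the paper's algorithm or benchmarks):

```python
class FSC:
    """Finite-state controller: a POMDP policy represented explicitly as a graph.

    Each node is labeled with an action; outgoing edges are indexed by the
    observation received. Policy-space search improves such a controller
    directly, rather than improving a value function.
    """

    def __init__(self, action_of, next_node, start=0):
        self.action_of = action_of    # node -> action label
        self.next_node = next_node    # (node, observation) -> successor node
        self.node = start             # current controller node

    def step(self, obs):
        # Follow the edge for the latest observation, then emit that
        # node's action. No belief-state update is needed.
        self.node = self.next_node[(self.node, obs)]
        return self.action_of[self.node]


# Hypothetical 3-node controller for a tiger-style toy problem:
# listen until a growl has been heard twice in a row, then open a door.
action_of = {0: "listen", 1: "listen", 2: "open"}
next_node = {
    (0, "growl-left"): 1, (0, "growl-right"): 1,
    (1, "growl-left"): 2, (1, "growl-right"): 2,
    (2, "growl-left"): 0, (2, "growl-right"): 0,
}

fsc = FSC(action_of, next_node)
print([fsc.step(o) for o in ["growl-left", "growl-left", "growl-right"]])
# -> ['listen', 'open', 'listen']
```

Policy iteration in this representation evaluates the controller (a system of linear equations, one per node) and then adds or rewires nodes; the heuristic search variant would grow the controller only toward nodes reachable from the start state.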