Isomorph-free branch and bound search for finite state controllers
IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Solving partially observable Markov decision processes (POMDPs) is highly intractable in general, at least in part because the optimal policy may be infinitely large. In this paper, we explore the problem of finding the optimal policy from a restricted set of policies, represented as finite state automata of a given size. This problem is also intractable, but we show that the complexity can be greatly reduced when the POMDP and/or policy are further constrained. We demonstrate good empirical results with a branch-and-bound method for finding globally optimal deterministic policies, and a gradient-ascent method for finding locally optimal stochastic policies.
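To make the policy class concrete: a deterministic finite state controller fixes one action per controller node and one successor node per observation, and its value can be computed exactly from the POMDP model. The sketch below (illustrative only; the model, variable names, and solution-by-successive-approximation are assumptions, not taken from the paper) evaluates such a controller on a toy POMDP.

```python
# Minimal sketch: evaluating a deterministic finite state controller (FSC)
# on a toy POMDP. The FSC has nodes q; node q takes action a[q] and moves
# to node eta[q][o] after observation o. The (node, state) values satisfy
#   V(q,s) = R(s,a[q]) + gamma * sum_{s'} T[s][a[q]][s'] *
#            sum_{o} O[s'][a[q]][o] * V(eta[q][o], s')
# which we solve here by successive approximation (a linear solve would
# also work, since the policy is fixed).

def evaluate_fsc(T, O, R, a, eta, gamma=0.95, iters=2000):
    nS = len(T)          # states
    nQ = len(a)          # controller nodes
    nO = len(O[0][0])    # observations
    V = [[0.0] * nS for _ in range(nQ)]
    for _ in range(iters):
        newV = [[0.0] * nS for _ in range(nQ)]
        for q in range(nQ):
            act = a[q]
            for s in range(nS):
                v = R[s][act]
                for s2 in range(nS):
                    p = T[s][act][s2]
                    if p == 0.0:
                        continue
                    for o in range(nO):
                        v += gamma * p * O[s2][act][o] * V[eta[q][o]][s2]
                newV[q][s] = v
        V = newV
    return V

# Illustrative 2-state, 1-action, 2-observation POMDP with reward 1
# everywhere, evaluated under a single-node controller; its value is the
# geometric series 1 / (1 - gamma).
T = [[[0.7, 0.3]], [[0.4, 0.6]]]      # T[s][a][s']
O = [[[0.8, 0.2]], [[0.1, 0.9]]]      # O[s'][a][o]
R = [[1.0], [1.0]]                    # R[s][a]
V = evaluate_fsc(T, O, R, a=[0], eta=[[0, 0]])
print(V[0][0])                        # approx. 1 / (1 - 0.95) = 20
```

A branch-and-bound search over deterministic controllers of a given size would enumerate choices of `a` and `eta`, using an evaluation like this for the exact value of complete controllers and upper bounds to prune partial ones.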