Markov Decision Processes: Discrete Stochastic Dynamic Programming
Markov Decision Processes: Discrete Stochastic Dynamic Programming
The Complexity of Decentralized Control of Markov Decision Processes
Mathematics of Operations Research
Transition-independent decentralized markov decision processes
AAMAS '03 Proceedings of the second international joint conference on Autonomous agents and multiagent systems
Approximate Solutions for Partially Observable Stochastic Games with Common Payoffs
AAMAS '04 Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems - Volume 1
Planning, learning and coordination in multiagent decision processes
TARK '96 Proceedings of the 6th conference on Theoretical aspects of rationality and knowledge
Formal models and algorithms for decentralized decision making under uncertainty
Autonomous Agents and Multi-Agent Systems
Lossless clustering of histories in decentralized POMDPs
Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Dynamic programming for partially observable stochastic games
AAAI'04 Proceedings of the 19th national conference on Artifical intelligence
Optimal and approximate Q-value functions for decentralized POMDPs
Journal of Artificial Intelligence Research
Taming decentralized POMDPs: towards efficient policy computation for multiagent settings
IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Planning and acting in partially observable stochastic domains
Artificial Intelligence
Point-based policy generation for decentralized POMDPs
Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: volume 1 - Volume 1
Point-based backup for decentralized POMDPs: complexity and new algorithms
Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: volume 1 - Volume 1
Incremental clustering and expansion for faster optimal planning in decentralized POMDPs
Journal of Artificial Intelligence Research
Hi-index | 0.00 |
Optimal decentralized decision making in a team of cooperative agents as formalized by decentralized POMDPs is a notoriously hard problem. A major obstacle is that the agents do not have access to a sufficient statistic during execution, which means that they need to base their actions on their histories of observations. A consequence is that even during off-line planning the choice of decision rules for different stages is tightly interwoven: decisions of earlier stages affect how to act optimally at later stages, and the optimal value function for a stage is known to have a dependence on the decisions made up to that point. This paper makes a contribution to the theory of decentralized POMDPs by showing how this dependence on the 'past joint policy' can be replaced by a sufficient statistic. These results are extended to the case of k-step delayed communication. The paper investigates the practical implications, as well as the effectiveness of a new pruning technique for MAA* methods, in a number of benchmark problems and discusses future avenues of research opened by these contributions.