Distributed Partially Observable Markov Decision Problems (Dis-POMDPs) are emerging as a popular approach for modeling sequential decision making in teams operating under uncertainty. For the agents to behave coherently, appropriate run-time communication is essential, and many run-time communication schemes for Dis-POMDPs have therefore been studied. Separately, a Finite State Machine (FSM) is a popular representation for a local policy that must operate over a very long or infinite time horizon. In this paper, we examine a run-time communication scheme in which the local policy of each agent is represented as an FSM. In this scheme, the meaning of each message is not predefined; it is given implicitly by the interaction between local policies. We propose an iterative-improvement algorithm that searches for a joint policy in a setting where run-time communication incurs a cost, so that agents communicate at run time only when doing so is cost-effective. Interestingly, our algorithm can find a joint policy that obtains a higher expected reward than a hand-crafted joint policy while using fewer nodes in each local FSM and fewer message types. Furthermore, we show experimentally that our algorithm can obtain a joint policy consisting of sufficiently complex local FSMs within a reasonable amount of time.
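To make the scheme concrete, the following minimal sketch shows how two local FSM policies with run-time messages might be executed. It is not the paper's algorithm: it only runs a fixed joint policy and does not perform the iterative-improvement search. All names (`Node`, `step`, `run`, `comm_cost`) and the toy actions and observations are hypothetical, and the sketch assumes each node emits at most one message per step and that transitions depend on the local observation and the message received.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    """One node of a local FSM policy (hypothetical structure)."""
    action: str                 # action taken while in this node
    message: Optional[int]      # message type sent, or None to stay silent
    # (own observation, message received) -> id of the next node
    next_node: Dict[Tuple[str, Optional[int]], int]


def step(fsm: Dict[int, Node], node_id: int,
         observation: str, received: Optional[int]) -> int:
    """Follow the transition selected by the observation and received message."""
    return fsm[node_id].next_node[(observation, received)]


def run(fsm_a: Dict[int, Node], fsm_b: Dict[int, Node],
        observations: List[Tuple[str, str]], comm_cost: float = 1.0):
    """Execute two local FSMs jointly; return the action trace and total message cost."""
    a = b = 0                   # assumption: both agents start in node 0
    total_cost = 0.0
    trace = []
    for obs_a, obs_b in observations:
        node_a, node_b = fsm_a[a], fsm_b[b]
        trace.append((node_a.action, node_b.action))
        # Every message actually sent is charged the communication cost.
        sent = (node_a.message, node_b.message)
        total_cost += comm_cost * sum(m is not None for m in sent)
        # Each agent transitions on its own observation and the other's message.
        a, b = (step(fsm_a, a, obs_a, node_b.message),
                step(fsm_b, b, obs_b, node_a.message))
    return trace, total_cost


# Agent A sends message 0 one step after observing "noise".
fsm_a = {
    0: Node("listen", None, {("quiet", None): 0, ("noise", None): 1}),
    1: Node("listen", 0, {("quiet", None): 0, ("noise", None): 0}),
}
# Agent B observes nothing useful; it acts only after receiving message 0.
fsm_b = {
    0: Node("wait", None, {("none", None): 0, ("none", 0): 1}),
    1: Node("open", None, {("none", None): 0, ("none", 0): 0}),
}

trace, total_cost = run(
    fsm_a, fsm_b, [("noise", "none"), ("quiet", "none"), ("quiet", "none")]
)
```

Note that message `0` has no predefined meaning in this sketch: it comes to mean "noise was observed" only because agent A's FSM sends it after that observation and agent B's FSM moves to its "open" node upon receiving it.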