Multiagent Meta-level Control for a Network of Weather Radars
WI-IAT '10 Proceedings of the 2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 02
Learning consistent policies in decentralized settings is often problematic: agents have only a myopic view of their neighboring states, which can lead to inconsistent action choices. The fundamental question addressed in this work is how to determine and obtain the minimal overlapping context that decentralized decision makers need in order to make their decisions more consistent. Our approach is a two-phase learning process. Agents first learn their policies offline in a simplified environment where detailed context information about neighbors is unnecessary. These local policies are then applied in more complex "real" environments, where agents are expected to encounter a much higher rate of inconsistencies (conflicts) with neighbors' actions. When conflicts are observed, agents switch to "special" states that augment local policy states with additional non-local state information, and they learn alternative actions to take in these specific situations. The resulting action choices are less likely to lead to conflicts. We evaluate our approach on meta-level decisions in a complex multiagent weather-tracking domain. Experimental results show that it achieves good performance on utility and conflict resolution while exploring only a small fraction of the whole search space.
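The conflict-driven state augmentation described in the abstract can be sketched as follows. This is a minimal illustrative implementation, not the paper's actual algorithm: a tabular Q-learner keys its value table on the local state alone until a conflict with a neighbor is observed, at which point that local state is promoted to a "special" state keyed on additional non-local context. All class and parameter names here are hypothetical.

```python
import random
from collections import defaultdict


class ConflictAwareAgent:
    """Tabular Q-learner that augments its local state with non-local
    (neighbor) context only for states where conflicts were observed.
    Illustrative sketch; names and structure are assumptions."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)   # maps (state_key, action) -> value
        self.actions = actions
        self.alpha = alpha
        self.gamma = gamma
        self.epsilon = epsilon
        self.augmented = set()        # local states promoted to "special" states

    def key(self, local_state, neighbor_context):
        # Phase 1: condition only on the local state. After a conflict,
        # the state is augmented with the minimal non-local context.
        if local_state in self.augmented:
            return (local_state, neighbor_context)
        return (local_state,)

    def act(self, local_state, neighbor_context):
        s = self.key(local_state, neighbor_context)
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(s, a)])

    def update(self, local_state, neighbor_context, action, reward,
               next_local, next_context, conflict=False):
        if conflict:
            # Conflict observed: future decisions in this local state
            # will condition on neighbor context as well.
            self.augmented.add(local_state)
        s = self.key(local_state, neighbor_context)
        s2 = self.key(next_local, next_context)
        best_next = max(self.q[(s2, a)] for a in self.actions)
        self.q[(s, action)] += self.alpha * (
            reward + self.gamma * best_next - self.q[(s, action)]
        )
```

Because only conflict-prone local states are augmented, the joint state space explored remains a small fraction of the full cross-product of local and neighbor states, which is the intuition behind the paper's search-space savings.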