Joint Equilibrium Policy Search for Multi-Agent Scheduling Problems

Authors:
Thomas Gabel;Martin Riedmiller
Affiliations:
Neuroinformatics Group, Department of Mathematics and Computer Science, Institute of Cognitive Science, University of Osnabrück, Osnabrück, Germany 49069;Neuroinformatics Group, Department of Mathematics and Computer Science, Institute of Cognitive Science, University of Osnabrück, Osnabrück, Germany 49069
Venue:
MATES '08 Proceedings of the 6th German conference on Multiagent System Technologies
Year:
2008

Abstract

We propose joint equilibrium policy search, a multi-agent learning algorithm for decentralized Markov decision processes with changing action sets. In its basic form, it relies on stochastic agent-specific policies, parameterized by probability distributions defined for every state, and on a heuristic that tells whether a joint equilibrium could be obtained. We also suggest an extended version in which each agent employs a global policy parameterization, which makes the approach applicable to larger-scale problems. Joint equilibrium policy search is well suited to application problems such as production planning and traffic control. In support of this, we apply our algorithms to a number of challenging scheduling benchmark problems and find that solutions of very high quality can be obtained.
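
To make the phrase "stochastic agent-specific policies parameterized by probability distributions defined for every state" more concrete, the following is a minimal sketch of one possible representation, not the authors' implementation. The class name `TabularStochasticPolicy`, the weight table, the uniform initialization, and the state and action labels are all assumptions introduced for illustration. The sketch only shows how one agent might store a distribution per state and sample from the actions currently available to it (the "changing action sets"). It does not include the learning update or the equilibrium heuristic, which the abstract does not specify.

```python
import random
from collections import defaultdict


class TabularStochasticPolicy:
    """Hypothetical per-agent policy: one weight per (state, action) pair."""

    def __init__(self, seed=None):
        # Assumption: unseen (state, action) pairs start at weight 1.0,
        # which corresponds to a uniform policy before any learning.
        self.weights = defaultdict(lambda: 1.0)
        self.rng = random.Random(seed)

    def action_probs(self, state, available):
        """Return a probability distribution over the available actions only."""
        if not available:
            raise ValueError("no action available in this state")
        raw = {a: self.weights[(state, a)] for a in available}
        total = sum(raw.values())
        return {a: w / total for a, w in raw.items()}

    def sample(self, state, available):
        """Draw one action according to the state's distribution."""
        probs = self.action_probs(state, available)
        actions = list(probs)
        return self.rng.choices(actions, weights=[probs[a] for a in actions], k=1)[0]


# Illustrative usage with made-up state and job names.
policy = TabularStochasticPolicy(seed=0)
policy.weights[("s0", "job_a")] = 3.0
probs = policy.action_probs("s0", ["job_a", "job_b"])
choice = policy.sample("s0", ["job_a", "job_b"])
```

Renormalizing over the available actions at query time is one simple way to cope with an action set that shrinks or grows between decisions. In a multi-agent setting, each agent would hold its own instance of such a policy.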