CLEAN rewards for improving multiagent coordination in the presence of exploration

  • Authors:
  • Chris HolmesParker; Adrian Agogino; Kagan Tumer

  • Affiliations:
  • Oregon State University, Corvallis, Oregon, USA; NASA Ames Research Center, Moffett Field, California, USA; Oregon State University, Corvallis, Oregon, USA

  • Venue:
  • Proceedings of the 2013 International Conference on Autonomous Agents and Multi-Agent Systems
  • Year:
  • 2013

Abstract

In cooperative multiagent systems, coordinating the joint actions of agents is difficult. A fundamental source of this difficulty is the slow learning process: each agent must not only learn how to behave in a complex environment, but must also account for the actions of the other learning agents. In particular, agents cannot distinguish the true environmental dynamics from the effects of other agents' stochastic exploratory actions, which introduces noise into each agent's reward signal. To address this, we introduce Coordinated Learning without Exploratory Action Noise (CLEAN) rewards: agent-specific shaped rewards that effectively remove such exploratory noise from each agent's reward signal. We demonstrate their performance with up to 1000 agents in a standard congestion problem.
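
The abstract does not spell out how CLEAN rewards are computed, so the following is a minimal sketch of the general idea under stated assumptions: all agents act greedily in the actual system (so no exploration noise enters the environment), and each agent privately evaluates a counterfactual exploratory action against the global reward of the joint greedy action. The congestion objective G, the node-utility shape, and all names and parameters below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Sketch of CLEAN-style reward computation in a toy congestion game.
# Assumptions (not from the abstract): the global reward G, the greedy
# acting / private-exploration scheme, and all constants are illustrative.

rng = np.random.default_rng(0)

N_AGENTS, N_NODES, CAPACITY = 50, 5, 10.0

def G(joint_action):
    """Global reward: each node's contribution peaks near its capacity."""
    counts = np.bincount(joint_action, minlength=N_NODES)
    return float(np.sum(counts * np.exp(-counts / CAPACITY)))

Q = np.zeros((N_AGENTS, N_NODES))   # per-agent action-value estimates
ALPHA, EPS = 0.1, 0.1               # learning rate, private exploration rate

for episode in range(500):
    # 1) Every agent acts greedily in the real system, so no exploratory
    #    actions perturb the environment that other agents observe.
    greedy = Q.argmax(axis=1)
    g_actual = G(greedy)

    for i in range(N_AGENTS):
        # 2) Each agent privately samples an exploratory action offline.
        a_prime = rng.integers(N_NODES) if rng.random() < EPS else greedy[i]
        # 3) Counterfactual: re-evaluate G with only agent i's action changed.
        counterfactual = greedy.copy()
        counterfactual[i] = a_prime
        clean_reward = G(counterfactual) - g_actual
        # 4) Update the value of the privately explored action.
        Q[i, a_prime] += ALPHA * (clean_reward - Q[i, a_prime])

print("final global reward:", G(Q.argmax(axis=1)))
```

Because exploration happens only inside each agent's private counterfactual evaluation, the reward any one agent learns from is never perturbed by another agent's exploratory actions, which is the noise-removal effect the abstract describes.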