Modeling difference rewards for multiagent learning

Authors:
Scott Proper; Kagan Tumer
Affiliations:
Oregon State University, Corvallis, OR; Oregon State University, Corvallis, OR
Venue:
Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 3
Year:
2012

Abstract

Difference rewards (a particular instance of reward shaping) have been used to allow multiagent domains to scale to large numbers of agents, but they remain difficult to compute in many domains. We present an approach that models the global reward using function approximation, which allows shaped difference rewards to be computed quickly. We demonstrate how this model can result in significant improvements in behavior for two air traffic control problems. We also show how the model of the global reward may be learned either online or offline using a linear combination of neural networks.
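
The idea in the abstract can be illustrated with a small sketch. A difference reward is commonly defined as D_i(z) = G(z) - G(z_-i), where G is the global reward, z is the joint state or action, and z_-i is a counterfactual in which agent i is removed. If G is expensive to evaluate, a learned approximation of G can be queried twice per agent instead. Everything below is an assumption made for illustration and is not taken from the paper: the toy congestion reward, the per-sector count features, the names (`features`, `simulate_global_reward`, `fit_global_model`, `difference_rewards`), and the use of ordinary least squares in place of the neural networks the abstract mentions.

```python
import numpy as np

N_SECTORS = 3  # hypothetical toy setting: each agent picks one of 3 sectors


def features(actions):
    """Per-sector agent counts and squared counts (assumed feature set)."""
    counts = np.bincount(actions, minlength=N_SECTORS).astype(float)
    return np.concatenate([counts, counts ** 2])


def simulate_global_reward(actions):
    """Toy stand-in for an expensive simulator: quadratic congestion penalty."""
    counts = np.bincount(actions, minlength=N_SECTORS)
    return -float(np.sum(counts ** 2))


def fit_global_model(n_samples=300, max_agents=6, seed=0):
    """Fit a linear model of G offline from sampled joint actions.

    The number of agents varies across samples so that the
    "agent removed" counterfactual stays within the training range.
    """
    rng = np.random.default_rng(seed)
    X, y = [], []
    for _ in range(n_samples):
        n = rng.integers(1, max_agents + 1)
        actions = rng.integers(0, N_SECTORS, size=n)
        X.append(features(actions))
        y.append(simulate_global_reward(actions))
    weights, *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)
    return weights


def difference_rewards(actions, weights):
    """D_i = G_model(z) - G_model(z without agent i), for every agent i."""
    g_all = features(actions) @ weights
    return np.array([
        g_all - features(np.delete(actions, i)) @ weights
        for i in range(len(actions))
    ])


weights = fit_global_model()
joint_action = np.array([0, 0, 0, 1, 2, 2])  # sector chosen by each agent
d = difference_rewards(joint_action, weights)
print(np.round(d, 3))
```

In this toy setting the model can represent G exactly, so agents in the most crowded sector receive the most negative difference reward (about -5, versus -1 for the lone agent in sector 1). With a real simulator the model would only approximate G, and the quality of the resulting difference rewards would depend on that approximation.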