Multi-agent reward analysis for learning in noisy domains

  • Authors:
  • Adrian Agogino; Kagan Tumer

  • Affiliations:
  • UC Santa Cruz, Moffett Field, CA; NASA Ames Research Center, Moffett Field, CA

  • Venue:
  • Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS)
  • Year:
  • 2005

Abstract

In many multi-agent learning problems, it is difficult to determine, a priori, the agent reward structure that will lead to good performance. This problem is particularly pronounced in continuous, noisy domains that are ill-suited to the simple table backup schemes commonly used in TD(λ)/Q-learning. In this paper, we present a new reward evaluation method that provides a visualization of the tradeoff between coordination among the agents and the difficulty of the learning problem each agent faces. This method is independent of the learning algorithm and is a function only of the problem domain and the agents' reward structure. We then use this reward property visualization method to determine an effective reward without performing extensive simulations. We test this method in both a static and a dynamic multi-rover learning domain in which the agents have continuous state spaces and noisy actions (e.g., the agents' movement decisions are not always carried out properly). Our results show that in the more difficult dynamic domain, the reward efficiency visualization method provides a two-order-of-magnitude speedup in selecting a good reward. Most importantly, it allows one to quickly create and verify rewards tailored to the observational limitations of the domain.
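The abstract's central objects are two properties of a candidate agent reward that can be evaluated directly from the domain and the reward definition rather than from full learning runs: how well changes in an agent's reward track changes in the system-level objective (coordination) and how strongly that reward responds to the agent's own actions rather than to the other agents and to noise (the difficulty of the individual learning problem). The following is a minimal Python sketch of how such properties could be estimated by sampling; it is an illustration under stated assumptions, not the paper's algorithm. The callables global_reward, agent_reward, sample_state, perturb_own, and perturb_others are hypothetical placeholders that a specific multi-rover domain would have to supply.

    import numpy as np

    def estimate_reward_properties(global_reward, agent_reward, sample_state,
                                   perturb_own, perturb_others, agent_idx,
                                   n_samples=1000, rng=None):
        """Sampling-based estimates of two properties of an agent reward.

        alignment:   fraction of sampled perturbations of this agent's action
                     for which the agent reward and the global reward change
                     in the same direction (coordination).
        sensitivity: magnitude of the agent reward's response to the agent's
                     own action changes relative to changes made by the other
                     agents (ease of the individual learning problem).
        """
        rng = np.random.default_rng() if rng is None else rng
        aligned = 0
        own_deltas, other_deltas = [], []
        for _ in range(n_samples):
            z = sample_state(rng)                      # joint state/action of all agents
            z_own = perturb_own(z, agent_idx, rng)     # only this agent acts differently
            z_oth = perturb_others(z, agent_idx, rng)  # only the other agents change

            d_agent = agent_reward(z_own, agent_idx) - agent_reward(z, agent_idx)
            d_global = global_reward(z_own) - global_reward(z)
            if d_agent * d_global > 0 or (d_agent == 0 and d_global == 0):
                aligned += 1

            own_deltas.append(abs(d_agent))
            other_deltas.append(abs(agent_reward(z_oth, agent_idx)
                                    - agent_reward(z, agent_idx)))

        alignment = aligned / n_samples
        sensitivity = float(np.mean(own_deltas)) / (float(np.mean(other_deltas)) + 1e-12)
        return alignment, sensitivity

Plotting these two estimates against each other for several candidate rewards gives the kind of tradeoff picture the abstract describes: rewards that score well on both axes can be shortlisted without running extensive learning simulations.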