Multi-agent reward analysis for learning in noisy domains

  • Authors:
  • Adrian Agogino; Kagan Tumer

  • Affiliations:
  • UC Santa Cruz, Moffett Field, CA; NASA Ames Research Center, Moffett Field, CA

  • Venue:
  • Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS)
  • Year:
  • 2005

Abstract

In many multi-agent learning problems, it is difficult to determine, a priori, the agent reward structure that will lead to good performance. This problem is particularly pronounced in continuous, noisy domains that are ill-suited to the simple table backup schemes commonly used in TD(λ)/Q-learning. In this paper, we present a new reward evaluation method that provides a visualization of the tradeoff between coordination among the agents and the difficulty of the learning problem each agent faces. This method is independent of the learning algorithm and is a function only of the problem domain and the agents' reward structure. We then use this reward property visualization method to determine an effective reward without performing extensive simulations. We test this method in both a static and a dynamic multi-rover learning domain in which the agents have continuous state spaces and noisy actions (e.g., the agents' movement decisions are not always carried out properly). Our results show that in the more difficult dynamic domain, the reward efficiency visualization method provides a two-order-of-magnitude speedup in selecting a good reward. Most importantly, it allows one to quickly create and verify rewards tailored to the observational limitations of the domain.
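The abstract's central objects are two properties of a candidate agent reward that can be evaluated directly from the domain and the reward definition rather than from full learning runs: how well changes in an agent's reward track changes in the system-level objective (coordination) and how strongly that reward responds to the agent's own actions rather than to the other agents and to noise (the difficulty of the individual learning problem). The following is a minimal Python sketch of how such properties could be estimated by sampling; it is an illustration under stated assumptions, not the paper's algorithm. The callables global_reward, agent_reward, sample_state, perturb_own, and perturb_others are hypothetical placeholders that a specific multi-rover domain would have to supply.

    import numpy as np

    def estimate_reward_properties(global_reward, agent_reward, sample_state,
                                   perturb_own, perturb_others, agent_idx,
                                   n_samples=1000, rng=None):
        """Sampling-based estimates of two properties of an agent reward.

        alignment:   fraction of sampled perturbations of this agent's action
                     for which the agent reward and the global reward change
                     in the same direction (coordination).
        sensitivity: magnitude of the agent reward's response to the agent's
                     own action changes relative to changes made by the other
                     agents (ease of the individual learning problem).
        """
        rng = np.random.default_rng() if rng is None else rng
        aligned = 0
        own_deltas, other_deltas = [], []
        for _ in range(n_samples):
            z = sample_state(rng)                      # joint state/action of all agents
            z_own = perturb_own(z, agent_idx, rng)     # only this agent acts differently
            z_oth = perturb_others(z, agent_idx, rng)  # only the other agents change

            d_agent = agent_reward(z_own, agent_idx) - agent_reward(z, agent_idx)
            d_global = global_reward(z_own) - global_reward(z)
            if d_agent * d_global > 0 or (d_agent == 0 and d_global == 0):
                aligned += 1

            own_deltas.append(abs(d_agent))
            other_deltas.append(abs(agent_reward(z_oth, agent_idx)
                                    - agent_reward(z, agent_idx)))

        alignment = aligned / n_samples
        sensitivity = float(np.mean(own_deltas)) / (float(np.mean(other_deltas)) + 1e-12)
        return alignment, sensitivity

Plotting these two estimates against each other for several candidate rewards gives the kind of tradeoff picture the abstract describes: rewards that score well on both axes can be shortlisted without running extensive learning simulations.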