Efficient reward functions for adaptive multi-rover systems

  • Authors:
  • Kagan Tumer; Adrian Agogino

  • Affiliations:
  • NASA Ames Research Center, Moffett Field, CA; UC Santa Cruz, NASA Ames Research Center, Moffett Field, CA

  • Venue:
  • LAMAS'05: Proceedings of the First International Conference on Learning and Adaption in Multi-Agent Systems
  • Year:
  • 2005

Abstract

This chapter focuses on deriving reward functions that allow multiple agents to co-evolve efficient control policies that maximize a system-level reward in noisy and dynamic environments. The solution we present is based on agent rewards satisfying two crucial properties. First, the agent reward function and the global reward function have to be aligned; that is, an agent maximizing its agent-specific reward should also maximize the global reward. Second, the agent has to receive sufficient “signal” from its reward; that is, an agent's actions should have a large influence over its agent-specific reward. Agents using rewards with these two properties evolve the correct policies quickly. This hypothesis is tested in episodic and non-episodic, continuous-space multi-rover environments where rovers evolve to maximize a global reward function defined over all rovers. The environments are dynamic (i.e., they change over time), noisy, and restrict communication between agents. We show that a control policy evolved using agent-specific rewards satisfying the above properties outperforms policies evolved using global rewards by up to 400%. More notably, with a larger number of rovers, or with rovers whose sensors are noisy and communication-limited, the proposed method outperforms the global reward by an even higher percentage than it does in noise-free conditions with a small number of rovers.
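The abstract does not give the reward formulas, but one well-known way to satisfy both properties at once, explored in related work by these authors, is a difference-style reward D_i = G(z) − G(z_{−i}), the global reward with and without agent i's contribution. The sketch below is a minimal illustration only: the point-of-interest global reward, the function names, and all parameters are assumptions for the example, not details taken from this paper.

```python
import numpy as np

def global_reward(rover_positions, poi_positions, poi_values, min_dist=1.0):
    """Hypothetical global reward for a rover domain: each point of interest
    (POI) is credited according to the closest rover that observes it."""
    if rover_positions.shape[0] == 0:
        return 0.0
    G = 0.0
    for poi, value in zip(poi_positions, poi_values):
        # Squared distance from every rover to this POI, bounded below.
        dists = np.maximum(np.sum((rover_positions - poi) ** 2, axis=1), min_dist)
        G += value / dists.min()  # credit the closest observation
    return G

def difference_reward(i, rover_positions, poi_positions, poi_values):
    """Difference-style agent reward: D_i = G(z) - G(z without rover i).
    Subtracting the counterfactual keeps D_i aligned with G while making it
    far more sensitive to rover i's own actions than G itself."""
    G_full = global_reward(rover_positions, poi_positions, poi_values)
    without_i = np.delete(rover_positions, i, axis=0)
    G_without = global_reward(without_i, poi_positions, poi_values)
    return G_full - G_without
```

In this toy setup, an agent's reward changes only through terms it actually affects, which captures the “sufficient signal” property, while any action that raises D_i also raises G, which captures alignment.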