This chapter focuses on deriving reward functions that allow multiple agents to co-evolve efficient control policies that maximize a system-level reward in noisy and dynamic environments. The solution we present is based on agent rewards satisfying two crucial properties. First, the agent-specific reward function and the global reward function have to be aligned, that is, an agent maximizing its agent-specific reward should also maximize the global reward. Second, the agent has to receive sufficient “signal” from its reward, that is, an agent's actions should have a large influence on its agent-specific reward. Agents using rewards with these two properties will evolve the correct policies quickly. This hypothesis is tested in episodic and non-episodic continuous-space multi-rover environments where rovers evolve to maximize a global reward function over all rovers. The environments are dynamic (i.e., they change over time), noisy, and restrict communication between agents. We show that control policies evolved using agent-specific rewards satisfying the above properties outperform policies evolved using the global reward by up to 400%. More notably, with larger numbers of rovers, or rovers with noisy, communication-limited sensors, the proposed method outperforms the global reward by a higher percentage than in noise-free conditions with a small number of rovers.
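The abstract does not spell out a concrete reward construction, but in the multiagent reward-shaping literature the two properties above are commonly satisfied by a "difference reward" of the form D_i = G(z) - G(z_{-i}), the global reward minus the global reward recomputed with agent i's contribution removed. The Python sketch below is illustrative only: the point-of-interest (POI) toy domain and the names global_reward and difference_reward are assumptions for this example, not the chapter's actual environment or code.

```python
import random

def global_reward(observations):
    """Toy system-level reward G: for each point of interest (POI),
    credit the best observation any rover made of it. This POI domain
    is an illustrative stand-in, not the chapter's rover problem."""
    total = 0.0
    for poi_values in observations:   # one list of per-rover readings per POI
        if poi_values:
            total += max(poi_values)
    return total

def difference_reward(observations, rover_idx):
    """Difference reward D_i = G(z) - G(z_{-i}): the global reward minus
    the global reward recomputed with rover `rover_idx`'s readings removed.
    Terms of G that rover i cannot affect cancel out, so D_i stays aligned
    with G while being highly sensitive to rover i's own actions."""
    without_i = [
        [v for j, v in enumerate(poi_values) if j != rover_idx]
        for poi_values in observations
    ]
    return global_reward(observations) - global_reward(without_i)

if __name__ == "__main__":
    random.seed(0)
    n_rovers, n_pois = 5, 3
    # observations[p][i]: value rover i's sensors report for POI p
    # (e.g., something like 1/distance; purely synthetic here)
    observations = [[random.random() for _ in range(n_rovers)]
                    for _ in range(n_pois)]
    for i in range(n_rovers):
        print(f"rover {i}: G = {global_reward(observations):.3f}, "
              f"D = {difference_reward(observations, i):.3f}")
```

In an evolutionary setup of the kind the abstract describes, D_i would replace G as the fitness assigned to rover i's controller during selection; in this toy, only a rover whose reading is the best observation of some POI receives nonzero credit, which is exactly the stronger "signal" the second property calls for.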