Analysis and solution of a predator-protector-prey multi-robot system by a high-level reinforcement learning architecture and the adaptive systems theory

  • Authors:
  • José Antonio Martín H.; Javier de Lope; Darío Maravall

  • Affiliations:
  • Dep. Sistemas Informáticos y Computación, Universidad Complutense de Madrid, Spain; Dept. Applied Intelligent Systems, Universidad Politécnica de Madrid, Spain; Dept. Artificial Intelligence, Universidad Politécnica de Madrid, Spain

  • Venue:
  • Robotics and Autonomous Systems
  • Year:
  • 2010

Abstract

Competitive robotic systems usually call for highly complicated strategies that must be acquired through complex learning architectures, since analytic solutions are impractical or entirely infeasible. In this work we design an experiment to study and validate a model of the complex phenomenon of adaptation. In particular, we study a reinforcement learning problem posed by a predator-protector-prey system composed of three different robots: a pure bio-mimetic reactive (in Brooks' sense, i.e. without reasoning and representation) predator-like robot, a protector-like robot with reinforcement learning capabilities, and a pure bio-mimetic reactive prey-like robot. From a high-level point of view, we are interested in whether the Law of Adaptation is sufficient to model and explain the whole learning process occurring in this multi-robot system. From a low-level point of view, our interest is in designing a learning system capable of solving such a complex competitive predator-protector-prey system optimally. We show how this learning problem can be addressed and solved effectively by means of a reinforcement learning setup that uses abstract actions to select a goal or target towards which a pure bio-mimetic reactive robot must navigate. The experimental results show that the Law of Adaptation fits this complex learning system and that the proposed reinforcement learning setup is able to find an optimal policy for controlling the protector robot in its role of protecting the prey against the predator robot.
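
The two-level idea in the abstract, a learner that picks a target while a reactive controller does the navigating, can be illustrated with a minimal sketch. This is not the authors' implementation: the grid world, the two abstract actions in `TARGETS`, the function names, and the use of tabular one-step Q-learning with the parameters `alpha`, `gamma`, and `epsilon` are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical abstract actions: each names a target to approach,
# not a motor command. The paper's actual action set may differ.
TARGETS = ("predator", "prey")


def step_toward(pos, target):
    """Reactive low-level controller (assumed): move at most one grid
    cell per axis toward the target position."""
    return tuple(p + (t > p) - (t < p) for p, t in zip(pos, target))


def choose_target(q, state, epsilon, rng):
    """Epsilon-greedy selection among the abstract actions."""
    if rng.random() < epsilon:
        return rng.choice(TARGETS)
    return max(TARGETS, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One-step tabular Q-learning update over abstract actions."""
    best_next = max(q[(next_state, a)] for a in TARGETS)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error


def protector_step(q, state, positions, epsilon, rng):
    """High level picks a target; low level takes one reactive step
    toward it. Returns the chosen target and the new position."""
    action = choose_target(q, state, epsilon, rng)
    new_pos = step_toward(positions["protector"], positions[action])
    return action, new_pos


# Usage: an untrained Q-table, one rewarded update, then a greedy step.
q = defaultdict(float)
rng = random.Random(0)
q_update(q, "s", "prey", reward=1.0, next_state="s2")
positions = {"protector": (0, 0), "predator": (5, 5), "prey": (-3, 0)}
action, new_pos = protector_step(q, "s", positions, epsilon=0.0, rng=rng)
```

In this sketch the learner never reasons about wheel commands or paths. It only learns which target is worth approaching in each state, which keeps the action space small and leaves navigation to the reactive layer.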