Multi-policy optimization in self-organizing systems

  • Authors:
  • Ivana Dusparic; Vinny Cahill

  • Affiliation:
  • Lero - The Irish Software Engineering Research Centre, Distributed Systems Group, School of Computer Science and Statistics, Trinity College Dublin (both authors)

  • Venue:
  • SOAR'09: Proceedings of the First International Conference on Self-Organizing Architectures
  • Year:
  • 2009

Abstract

Self-organizing systems are often implemented as collections of collaborating agents. Such agents may need to optimize their own performance according to multiple policies, as well as contribute to the optimization of overall system performance towards a potentially different set of policies. These policies can be heterogeneous, i.e., implemented on different sets of agents, active at different times, and assigned different levels of priority, which in turn makes the agents that compose the system heterogeneous. Numerous biologically-inspired techniques, as well as techniques from artificial intelligence, have been used to implement such self-organizing systems. In this paper we review the most commonly used techniques for multi-policy optimization in such systems, specifically those based on ant colony optimization, evolutionary algorithms, particle swarm optimization and reinforcement learning (RL). We analyze the characteristics and existing applications of the reviewed algorithms, assessing their suitability for particular types of optimization problems based on the environment and policy characteristics. We focus on RL, as it is considered particularly suitable for large-scale self-organizing systems due to its ability to take into account the long-term consequences of the actions executed; RL thereby enables the system to learn not only the immediate payoffs of its actions, but also the actions that are best for the long-term performance of the system. Existing RL implementations mostly focus on optimization towards a single system policy, while most multi-policy RL-based optimization techniques have so far been implemented only on a single agent. We argue that, in order to be more widely utilized as a technique for self-optimization, RL needs to address both multiple policies and multiple agents simultaneously, and we analyze the challenges involved in extending existing RL optimization techniques, or developing new ones, to do so.
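
To make the abstract's point about RL concrete, the following is a minimal sketch (not taken from the paper) of tabular Q-learning for a single agent pursuing multiple policies: one Q-table is kept per policy, the discount factor is what captures long-term consequences rather than just immediate payoffs, and actions are chosen by a simple priority-weighted sum of per-policy values. The policy names, priorities, reward signals and environment interface are illustrative assumptions, not the authors' method.

```python
# Illustrative sketch: per-policy tabular Q-learning with a
# priority-weighted combination of action values (assumed setup).
import random
from collections import defaultdict

ALPHA = 0.1    # learning rate
GAMMA = 0.9    # discount factor: weights long-term consequences of actions
EPSILON = 0.1  # exploration rate

ACTIONS = ["a0", "a1"]
# Hypothetical policies with relative priorities (higher = more important).
POLICIES = {"local_performance": 1.0, "system_performance": 0.5}

# One Q-table per policy: Q[policy][(state, action)] -> estimated return.
Q = {p: defaultdict(float) for p in POLICIES}

def select_action(state):
    """Epsilon-greedy over the priority-weighted sum of per-policy Q-values."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    def combined_value(a):
        return sum(w * Q[p][(state, a)] for p, w in POLICIES.items())
    return max(ACTIONS, key=combined_value)

def update(state, action, rewards, next_state):
    """Standard Q-learning update, applied independently for each policy.

    `rewards` maps each policy name to the reward it defines for this
    transition. The GAMMA-discounted maximum over next-state values is
    what lets the agent learn long-term payoff, not just immediate reward.
    """
    for p in POLICIES:
        best_next = max(Q[p][(next_state, a)] for a in ACTIONS)
        td_target = rewards[p] + GAMMA * best_next
        Q[p][(state, action)] += ALPHA * (td_target - Q[p][(state, action)])
```

The weighted-sum combination above is only one possible arbitration scheme between policies; the challenges of doing this across many agents simultaneously are exactly what the paper identifies as open issues for RL-based self-optimization.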