The grid, the load and the gradient

  • Authors:
  • Amos Brocco

  • Affiliations:
  • Department of Innovative Technologies, Information Systems and Networking Institute, University of Applied Science of Southern Switzerland (SUPSI), Manno, Switzerland 6928

  • Venue:
  • Natural Computing: an international journal
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

An important concern for an efficient use of distributed computing is dealing with load balancing to ensure all available nodes and their shared resources are equally exploited. In large scale systems such as volunteer computing platforms and desktop grids, centralized solutions may introduce performance bottlenecks and single points of failure. Accordingly fully distributed alternatives have been considered, due to their inherent robustness and reliability. In extremely dynamic contexts, scheduling middlewares should adapt their job scheduling policies to the actual availability and overcome the volatility and heterogeneity typical of the underlying nodes. To deal with the dynamicity of a large pool of resources, self-organizing and adaptive solutions represent a promising research direction. Solutions based on bio-inspired methodologies are particularly suitable, as they inherently provide the desired features. In this paper we present a fully distributed load balancing mechanism, called ozmos, which aims at increasing the efficiency of distributed computing systems through peer-to-peer interaction between nodes. The proposed algorithm is based on a Chord overlay, and employs ant-like agents to spread information about the current load on each node, to reschedule tasks from overloaded systems to underloaded ones, and to relocate incompatible tasks on suitable resources in heterogeneous grids. By means of several evaluation scenarios we demonstrate the effectiveness of the proposed solution in achieving system-wide load balancing, both with homogeneous and heterogeneous resources. In particular we consider the load balancing performance of our approach, its scalability, as well as its communication efficiency.