Scalable dimensioning of resilient Lambda Grids

  • Authors:
  • Pieter Thysebaert;Marc De Leenheer;Bruno Volckaert;Filip De Turck;Bart Dhoedt;Piet Demeester

  • Affiliations:
  • Ghent University - IBBT - IMEC, Department of Information Technology, Gaston Crommenlaan 8 bus 201, 9050 Gent, Belgium (all authors)

  • Venue:
  • Future Generation Computer Systems
  • Year:
  • 2008

Abstract

Grids aggregate numerous dispersed computational, storage and network resources and can satisfy even the most demanding computing jobs. Due to the data-intensive nature of Grid jobs, there is increasing interest in Grids that use optical transport networks, as this technology allows for the timely delivery of large amounts of data. Such Grids are commonly referred to as Lambda Grids. An important aspect of Grid deployment is the allocation and activation of installed network capacity, needed to transfer data and jobs to and from remote resources. However, the exact nature of a Grid's network traffic depends on the way arriving workload is scheduled over the various Grid sites. As Grids may feature high numbers of resources, jobs and users, solving the combined Grid network dimensioning and workload scheduling problem requires scalable mathematical methods such as Divisible Load Theory (DLT). Lambda Grids add further complexity, since wavelength granularity and continuity or conversion constraints must be enforced. Additionally, Grid resources cannot be expected to be available at all times. Therefore, the extra complexity of resilience against possible resource failures must be taken into account when modelling the combined Grid network dimensioning and workload scheduling problem, which reinforces the need for scalable solution methods. In this work, we tackle the combined Lambda Grid dimensioning and workload scheduling problem and incorporate single-resource failure or unavailability scenarios. We use Divisible Load Theory to tackle the scalability problem and compare non-resilient Lambda Grid dimensioning to the dimensions needed to survive single-resource failures. We distinguish three failure scenarios relevant to Lambda Grid deployment: computational element, network link and optical cross-connect failure. Using regular network topologies, we derive analytical bounds on the dimensioning cost.
To validate these bounds, we present comparisons for the resulting Grid dimensions assuming a 2-tier Grid operation as a function of varying wavelength granularity, fiber/wavelength cost models, traffic demand asymmetry and Grid scheduling strategy for a specific set of optical transport networks.
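
The idea of treating workload as arbitrarily divisible, and of dimensioning for a single-resource failure, can be illustrated with a toy calculation. The sketch below is not the paper's model: it ignores the network, wavelength granularity and cost functions entirely, and the function names (split_divisible_load, resilient_capacity_per_site, overhead_factor) are hypothetical. It only assumes that load can be split in any proportion and that, with n identical sites, any n-1 surviving sites must be able to carry the full load.

```python
from fractions import Fraction


def split_divisible_load(total_load, capacities):
    """Split an arbitrarily divisible load proportionally to site capacities.

    Illustrative assumption: load can be cut into pieces of any size,
    which is the basic premise of divisible load models.
    """
    total_capacity = sum(capacities)
    return [Fraction(total_load) * c / total_capacity for c in capacities]


def resilient_capacity_per_site(total_load, n_sites):
    """Capacity each of n identical sites needs so that any n-1 sites
    can still process the whole load after one site fails."""
    if n_sites < 2:
        raise ValueError("need at least two sites to survive a single failure")
    return Fraction(total_load, n_sites - 1)


def overhead_factor(n_sites):
    """Ratio of resilient to non-resilient total capacity for n identical sites."""
    if n_sites < 2:
        raise ValueError("need at least two sites to survive a single failure")
    return Fraction(n_sites, n_sites - 1)


if __name__ == "__main__":
    # Three sites where the third has twice the capacity of the others.
    print(split_divisible_load(100, [1, 1, 2]))
    # Five identical sites: per-site capacity with and without resilience.
    print(Fraction(100, 5), resilient_capacity_per_site(100, 5))
    print(overhead_factor(5))
```

Under these simplifying assumptions the resilience overhead shrinks as the number of sites grows, since the factor n/(n-1) approaches 1; the bounds derived in the paper additionally account for network and optical constraints that this sketch leaves out.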