A Scalable Wide-Area Grid Resource Management Framework

  • Authors:
  • Mohamed El-Darieby; Diwakar Krishnamurthy

  • Affiliations:
  • University of Regina, Canada; University of Calgary, Canada

  • Venue:
  • ICNS '06 Proceedings of the International Conference on Networking and Services
  • Year:
  • 2006

Abstract

Grid computing systems federate resources belonging to several organizations to support applications with large computation and storage needs. Effective resource management is crucial for realizing the promise of the Grid. However, current Grid resource management frameworks have a number of limitations, including poor scalability and inadequate support for Quality of Service (QoS). This paper describes a novel and scalable Grid resource management framework that can address these limitations. The framework relies on a hierarchical organization of resources and resource managers (RMs) within an organization. Resources are assigned to jobs through decentralized inter- and intra-organizational collaborations between RMs. The framework employs a hierarchical information aggregation scheme that permits scalable Grid resource management. This capability allows more intelligent placement of workloads across the Grid than is feasible with traditional resource managers. For example, loads can be balanced across Grid clusters to avoid overutilization of resources, resulting in better QoS for jobs. Hierarchical segmentation of Grid resources allows the framework to handle dynamic situations (e.g., failure recovery and nodes joining the Grid). The improved scalability of the framework, however, comes at the price of additional complexity and overhead: sophisticated protocols need to be designed to build the hierarchy and provide its functionality. Considering the benefits and limitations of our approach, we believe the hierarchical framework is best suited for managing planetary-scale Grid systems supporting "embarrassingly" parallel jobs that require computational resources beyond the borders of an organization.
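
The abstract does not specify the aggregation scheme or the RM protocols, so the following is only a minimal illustrative sketch of the general idea, not the authors' design. It assumes a hypothetical `ResourceManager` tree in which each leaf RM tracks free CPUs for one cluster, each parent summarizes its subtree as a single number, and a job is routed toward the subtree with the most free capacity. All names, the single-metric summary, and the greedy placement rule are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ResourceManager:
    """Hypothetical RM node: a leaf manages one cluster, an inner node manages child RMs."""
    name: str
    free_cpus: int = 0  # only meaningful for leaf RMs (assumed metric)
    children: List["ResourceManager"] = field(default_factory=list)

    def aggregate_free(self) -> int:
        """Summarize the free capacity of the whole subtree as one number."""
        if not self.children:
            return self.free_cpus
        return sum(child.aggregate_free() for child in self.children)

    def place(self, cpus_needed: int) -> Optional[str]:
        """Route a job toward the subtree with the most free capacity.

        Returns the name of the leaf RM that accepted the job, or None.
        Simplification: a job must fit entirely within one leaf cluster.
        """
        if self.aggregate_free() < cpus_needed:
            return None  # the summary alone rules this subtree out
        if not self.children:
            self.free_cpus -= cpus_needed
            return self.name
        # Visit children from most to least free capacity (greedy balancing).
        ranked = sorted(self.children, key=lambda c: c.aggregate_free(), reverse=True)
        for child in ranked:
            placed = child.place(cpus_needed)
            if placed is not None:
                return placed
        return None


# Two organizations, three clusters (made-up capacities).
grid = ResourceManager("grid", children=[
    ResourceManager("org-a", children=[
        ResourceManager("a1", free_cpus=8),
        ResourceManager("a2", free_cpus=2),
    ]),
    ResourceManager("org-b", children=[
        ResourceManager("b1", free_cpus=6),
    ]),
])
```

In this sketch a parent consults only its children's summaries, so the state held at any one RM grows with its fan-out rather than with the total number of Grid nodes. A real framework would cache and propagate these summaries through a protocol instead of recomputing them recursively on every request.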