SNMP-based monitoring agents and heuristic scheduling for large-scale grids

  • Authors:
  • Edgar Magaña;Laurent Lefevre;Masum Hasan;Joan Serrat

  • Affiliations:
  • Cisco Systems, Inc., San Jose, CA and INRIA, RESO, LIP Laboratory, UMR, CNRS, ENS Lyon, UCB, France;Universitat Politècnica de Catalunya, Barcelona, Spain;Cisco Systems, Inc., San Jose, CA;INRIA, RESO, LIP Laboratory, UMR, CNRS, ENS Lyon, UCB, France

  • Venue:
  • OTM'07 Proceedings of the 2007 OTM confederated international conference on On the move to meaningful internet systems: CoopIS, DOA, ODBASE, GADA, and IS - Volume Part II
  • Year:
  • 2007

Abstract

This paper presents an SNMP-based resource monitoring system and a heuristic resource scheduling system for managing large-scale Grids. The approach involves two phases: resource monitoring and resource scheduling. The resource monitoring (and discovery) phase is supported by the SNMP-based Balanced Load Monitoring Agents for Resource Scheduling (SBLOMARS). This monitoring and discovery approach differs from current distributed monitoring systems in three main areas. First, it reaches a high level of generality by integrating SNMP technology, and thus offers an alternative solution for handling heterogeneous operating platforms. Second, it addresses the flexibility problem by implementing complex dynamic software structures, which are used to monitor anything from simple personal computers to robust multi-processor systems or clusters, including those with multiple hard disks and storage partitions. Finally, it addresses the scalability problem by distributing the monitoring system into a set of sub-monitoring instances, each specific to one kind of computational resource (processor, memory, software, network, and storage). The resource scheduling phase is supported by the Balanced Load Multi-Constrain Resource Scheduler (BLOMERS). This scheduler is based on a Genetic Algorithm, as an alternative approach to the inherent NP-hard problem of resource scheduling in large-scale Grids. We show graphical and textual snapshots of resource availability reports, as well as a scheduling scenario on the Grid5000 platform. The result is a scalable scheduler that balances load very evenly across all nodes participating in the Grid.
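
To illustrate the general idea of genetic-algorithm scheduling for load balancing, the following is a minimal sketch and not the BLOMERS implementation. Everything in it is an assumption made for illustration: the function names (`node_loads`, `imbalance`, `schedule`), the encoding (one node index per task), the fitness measure (standard deviation of node loads), and the operators (tournament selection, one-point crossover, random mutation, elitism). In a real deployment the `base_loads` values would plausibly come from a monitoring layer such as SBLOMARS; here they are plain numbers.

```python
import random
import statistics


def node_loads(assignment, task_costs, base_loads):
    """Return the load on each node after placing tasks per `assignment`."""
    loads = list(base_loads)
    for task, node in enumerate(assignment):
        loads[node] += task_costs[task]
    return loads


def imbalance(assignment, task_costs, base_loads):
    """Population standard deviation of node loads; lower means better balance."""
    return statistics.pstdev(node_loads(assignment, task_costs, base_loads))


def schedule(task_costs, base_loads, pop_size=40, generations=200,
             mutation_rate=0.1, seed=0):
    """Hypothetical GA: evolve task-to-node assignments that minimize imbalance."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    n_tasks, n_nodes = len(task_costs), len(base_loads)

    def cost(assignment):
        return imbalance(assignment, task_costs, base_loads)

    # Each chromosome maps task index -> node index.
    population = [[rng.randrange(n_nodes) for _ in range(n_tasks)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=cost)
        next_gen = population[:2]  # elitism: carry over the two best unchanged
        while len(next_gen) < pop_size:
            # Tournament selection: best of three random candidates.
            p1 = min(rng.sample(population, 3), key=cost)
            p2 = min(rng.sample(population, 3), key=cost)
            # One-point crossover (slicing creates a new list).
            cut = rng.randrange(1, n_tasks) if n_tasks > 1 else 0
            child = p1[:cut] + p2[cut:]
            # Mutation: occasionally reassign a task to a random node.
            for i in range(n_tasks):
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(n_nodes)
            next_gen.append(child)
        population = next_gen
    return min(population, key=cost)


if __name__ == "__main__":
    costs = [4, 3, 2, 1, 4, 3, 2, 1]   # illustrative task costs
    current = [0, 0, 0, 0]             # illustrative per-node load readings
    best = schedule(costs, current)
    print(best, node_loads(best, costs, current))
```

Because the two best assignments survive each generation, the best imbalance never gets worse over time; a heuristic like this does not guarantee an optimal schedule, which is the usual trade-off when the underlying problem is NP-hard.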