Monitoring of Grid scientific workflows

  • Authors:
  • Bartosz Balis;Marian Bubak;Bartłomiej Łabno

  • Affiliations:
  • (Correspd. E-mail: balis@agh.edu.pl) Institute of Computer Science, AGH University of Science and Technology, Krakow, Poland;Institute of Computer Science & ACC CYFRONET AGH, Krakow, Poland;ACC CYFRONET AGH, Krakow, Poland

  • Venue:
  • Scientific Programming - Large-Scale Programming Tools and Environments
  • Year:
  • 2008

Abstract

Scientific workflows are a means of conducting in silico experiments in modern computing infrastructures for e-Science, often built on top of Grids. Monitoring of Grid scientific workflows is essential not only for performance analysis but also for collecting provenance data and gathering feedback useful in future decisions, e.g., those related to optimization of resource usage. In this paper, basic problems related to monitoring of Grid scientific workflows are discussed. Because workflows are highly distributed, loosely coupled in space and time, heterogeneous, and heavily reliant on legacy codes, they are exceptionally challenging from the monitoring point of view. We propose a Grid monitoring architecture for scientific workflows. The monitoring data correlation problem is described, and an algorithm for on-line distributed collection of monitoring data is proposed. We demonstrate a prototype implementation of the proposed workflow monitoring architecture, the GEMINI monitoring system, and its use for monitoring a real-life scientific workflow.