Exploring many task computing in scientific workflows

  • Authors:
  • Eduardo Ogasawara;Daniel de Oliveira;Fernando Chirigati;Carlos Eduardo Barbosa;Renato Elias;Vanessa Braganholo;Alvaro Coutinho;Marta Mattoso

  • Affiliations:
  • Federal University of Rio de Janeiro - Rio de Janeiro -- Brazil;Federal University of Rio de Janeiro - Rio de Janeiro -- Brazil;Federal University of Rio de Janeiro - Rio de Janeiro -- Brazil;Federal University of Rio de Janeiro - Rio de Janeiro -- Brazil;Federal University of Rio de Janeiro - Rio de Janeiro -- Brazil;Federal University of Rio de Janeiro - Rio de Janeiro -- Brazil;Federal University of Rio de Janeiro - Rio de Janeiro -- Brazil;Federal University of Rio de Janeiro - Rio de Janeiro -- Brazil

  • Venue:
  • Proceedings of the 2nd Workshop on Many-Task Computing on Grids and Supercomputers
  • Year:
  • 2009

Abstract

One of the main advantages of using a scientific workflow management system (SWfMS) to orchestrate data flows among scientific activities is the ability to control and register the whole workflow execution. Executing workflow activities on high performance computing (HPC) resources makes this execution control difficult. Current solutions leave the scheduling to the HPC queue system. Since the workflow execution engine does not run on the remote clusters, the SWfMS is not aware of the parallelization strategy used in the workflow execution. Consequently, remote execution control and provenance registration of the parallel activities are very limited on the SWfMS side. This work presents a set of components that can be included in the workflow specification of any SWfMS to control the parallelization of activities as many-task computing (MTC). In addition, these components can gather provenance data during remote workflow execution. Through these MTC components, the parallelization strategy can be registered and reused, and provenance data can be uniformly queried. We have evaluated our approach by performing parameter sweep parallelization in solving the incompressible 3D Navier-Stokes equations. Experimental results show performance gains along with the additional benefit of distributed provenance support.
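The idea of running a parameter sweep as many independent tasks while recording provenance for each one can be illustrated with a minimal sketch. This is not the authors' components or any SWfMS API: the function names (`run_activity`, `run_with_provenance`, `parameter_sweep`), the parameter names (`reynolds`, `mesh`), and the toy computation are all assumptions made for illustration, and a local thread pool stands in for a remote cluster.

```python
import itertools
import time
from concurrent.futures import ThreadPoolExecutor


def run_activity(params):
    """Hypothetical stand-in for one solver run; returns a toy value."""
    return params["reynolds"] * params["mesh"]


def run_with_provenance(task_id, params):
    """Run one task and record its inputs, output, and elapsed time."""
    start = time.time()
    result = run_activity(params)
    return {
        "task_id": task_id,
        "params": params,
        "result": result,
        "elapsed_s": time.time() - start,
        "status": "ok",
    }


def parameter_sweep(grid, workers=4):
    """Expand the grid into independent tasks and run them concurrently."""
    names = sorted(grid)
    combos = [
        dict(zip(names, values))
        for values in itertools.product(*(grid[n] for n in names))
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(run_with_provenance, i, p)
            for i, p in enumerate(combos)
        ]
        # Collect in submission order so records line up with task ids.
        return [f.result() for f in futures]


# Assumed example grid: 3 x 2 = 6 independent tasks.
provenance = parameter_sweep({"reynolds": [100, 400, 1000], "mesh": [1, 2]})
```

Because every task returns a record with the same fields, the collected list can be filtered or aggregated uniformly, which is the property the abstract attributes to provenance gathered through the MTC components.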