An adaptive parallel execution strategy for cloud-based scientific workflows

  • Authors:
  • Daniel de Oliveira; Eduardo Ogasawara; Kary Ocaña; Fernanda Baião; Marta Mattoso

  • Affiliations:
  • COPPE-UFRJ/Federal University of Rio de Janeiro, Rio de Janeiro, Brazil; COPPE-UFRJ/Federal University of Rio de Janeiro, Rio de Janeiro, Brazil and CEFET-RJ, Rio de Janeiro, Brazil; COPPE-UFRJ/Federal University of Rio de Janeiro, Rio de Janeiro, Brazil; NP2Tec-UNIRIO/Federal University of the State of Rio de Janeiro, Rio de Janeiro, Brazil; COPPE-UFRJ/Federal University of Rio de Janeiro, Rio de Janeiro, Brazil

  • Venue:
  • Concurrency and Computation: Practice & Experience
  • Year:
  • 2012

Abstract

Many existing large-scale scientific experiments modeled as scientific workflows are compute-intensive. Some scientific workflow management systems already exploit parallel techniques, such as parameter sweep and data fragmentation, to improve performance. In those systems, computing resources are used to carry out many computational tasks in high-performance environments, such as multiprocessor machines or clusters. Meanwhile, cloud computing provides scalable and elastic resources that can be instantiated on demand during the course of a scientific experiment, without requiring users to acquire expensive infrastructure or to configure many pieces of software. Because of these advantages, some scientists have already adopted the cloud model in their scientific experiments. However, this model also raises many challenges. When scientists execute scientific workflows that require parallelism, it is hard to decide a priori how many resources to use and for how long they will be needed, because the allocation of these resources is elastic and based on demand. In addition, scientists have to manage new aspects such as the initialization of virtual machines and the impact of data staging. SciCumulus is a middleware that manages the parallel execution of scientific workflows in cloud environments. In this paper, we introduce an adaptive approach for executing parallel scientific workflows in the cloud. This approach adapts itself to the availability of resources during workflow execution: it checks the available computational power and dynamically tunes the workflow activity size to achieve better performance. Experimental evaluation showed the benefits of parallelizing scientific workflows using the adaptive approach of SciCumulus, which yielded a performance improvement of up to 47.1%. Copyright © 2011 John Wiley & Sons, Ltd.
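
The idea of tuning the workflow activity size to the available computational power can be illustrated with a minimal sketch. This is not the SciCumulus algorithm, which the abstract does not detail. It assumes a simple heuristic in which pending tasks are bundled so that each available core receives roughly a fixed number of bundles. The names `tune_group_size`, `plan_groups`, and the parameter `groups_per_core` are hypothetical and chosen only for illustration.

```python
import math


def tune_group_size(pending_tasks, available_cores, groups_per_core=2):
    """Hypothetical heuristic: how many tasks to bundle into one unit of work.

    Assumption (not from the paper): aim for about `groups_per_core`
    bundles per available core, so bundles shrink as resources grow.
    """
    if pending_tasks <= 0:
        return 0
    cores = max(1, available_cores)  # guard against a momentarily empty pool
    target_groups = cores * groups_per_core
    return max(1, math.ceil(pending_tasks / target_groups))


def plan_groups(task_ids, available_cores, groups_per_core=2):
    """Split the pending task list into bundles of the tuned size."""
    size = tune_group_size(len(task_ids), available_cores, groups_per_core)
    if size == 0:
        return []
    return [task_ids[i:i + size] for i in range(0, len(task_ids), size)]


# Re-plan whenever the number of available cores changes.
tasks = list(range(100))
few = plan_groups(tasks, available_cores=4)    # small pool: larger bundles
many = plan_groups(tasks, available_cores=16)  # larger pool: smaller bundles
print(len(few[0]), len(many[0]))  # prints "13 4"
```

In this sketch, a scheduler would call `plan_groups` again each time virtual machines are added or released, so that the bundle size follows the resources that are actually available at that moment.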