Automating Data-Throttling Analysis for Data-Intensive Workflows

  • Authors:
  • Ricardo J. Rodriguez;Rafael Tolosana-Calasanz;Omer F. Rana

  • Affiliations:
  • -;-;-

  • Venue:
  • CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Data movement between tasks in scientific workflows has received limited attention compared to task execution. Often the staging of data between tasks is either assumed or the time delay in data transfer is considered to be negligible (compared to task execution). Where data consists of files, such file transfers are accomplished as fast as the network links allow, and once transferred, the files are buffered/stored at their destination. Where a task requires multiple files to execute (from different tasks), it must, however, remain idle until all files are available. Hence, network bandwidth and buffer/storage within a workflow are often not used effectively. We propose an automated workflow structural analysis method for Directed Acyclic Graphs (DAGs) which utilises information from previous workflow executions. The method obtains data-throttling values for the data transfer to enable network bandwidth and buffer/storage capacity to be managed more efficiently. We convert a DAG representation into a Petri net model and analyse the resulting graph using an iterative method to compute data-throttling values. Our approach is demonstrated using the Montage workflow.