Measuring the effectiveness of throttled data transfers on data-intensive workflows

  • Authors:
  • Ricardo J. Rodríguez;Rafael Tolosana-Calasanz;Omer F. Rana

  • Affiliations:
  • Dpto. de Informática e Ingeniería de Sistemas, Universidad de Zaragoza, Zaragoza, Spain;Dpto. de Informática e Ingeniería de Sistemas, Universidad de Zaragoza, Zaragoza, Spain;School of Computer Science & Informatics, Cardiff University, Cardiff, United Kingdom

  • Venue:
  • KES-AMSTA'12 Proceedings of the 6th KES international conference on Agent and Multi-Agent Systems: technologies and applications
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

In data intensive workflows, which often involve files, transfer between tasks is typically accomplished as fast as the network links allow, and once transferred, the files are buffered/stored at their destination. Where a task requires multiple files to execute (from different previous tasks), it must remain idle until all files are available. Hence, network bandwidth and buffer/storage within a workflow are often not used effectively. In this paper, we are quantitatively measuring the impact that applying an intelligent data movement policy can have on buffer/storage in comparison with existing approaches. Our main objective is to propose a metric that considers a workflow structure expressed as a Directed Acyclic Graph (DAG), and performance information collected from historical past executions of the considered workflow. This metric is intended for use at the design-stage, to compare various DAG structures and evaluate their potential for optimisation (of network bandwidth and buffer use).