Prediction-based auto-scaling of scientific workflows

Authors:
Reginald Cushing; Spiros Koulouzis; Adam S. Z. Belloum; Marian Bubak
Affiliations:
University of Amsterdam; University of Amsterdam; University of Amsterdam; University of Amsterdam and AGH Krakow
Venue:
Proceedings of the 9th International Workshop on Middleware for Grids, Clouds and e-Science
Year:
2011

Abstract

In this paper we propose a novel method for auto-scaling data-centric workflow tasks. Scaling is driven by a prediction mechanism in which the input data load on each task within a workflow is used to estimate that task's execution time. Using these load predictions, the framework can make informed decisions to scale multiple workflow tasks independently, improving overall throughput and reducing workflow bottlenecks. The method was implemented in the WS-VLAM workflow system, and with an image-analysis workflow we show that it achieves faster data processing rates and reduces overall workflow makespan.
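
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the WS-VLAM implementation or the authors' algorithm: the names `TaskLoad`, `predicted_drain_seconds` and `plan_scaling`, the linear cost model (queued items times mean time per item, divided by instance count), and the greedy allocation policy are all assumptions made for illustration only.

```python
from dataclasses import dataclass, replace


@dataclass
class TaskLoad:
    """Hypothetical per-task load record (not a WS-VLAM type)."""
    name: str
    queued_items: int          # input data items waiting for this task
    mean_item_seconds: float   # observed mean processing time per item
    instances: int = 1         # task instances currently running


def predicted_drain_seconds(task: TaskLoad) -> float:
    """Assumed linear model: time to process the queued input in parallel."""
    return task.queued_items * task.mean_item_seconds / task.instances


def plan_scaling(tasks, spare_slots):
    """Give each spare slot to the task with the longest predicted drain time.

    Returns a mapping from task name to the planned number of instances.
    """
    work = [replace(t) for t in tasks]  # copy so the caller's objects are untouched
    plan = {t.name: t.instances for t in work}
    if not work:
        return plan
    for _ in range(spare_slots):
        bottleneck = max(work, key=predicted_drain_seconds)
        if bottleneck.queued_items == 0:
            break  # nothing is waiting anywhere, so no scaling is needed
        bottleneck.instances += 1
        plan[bottleneck.name] = bottleneck.instances
    return plan


# Example with made-up numbers: the heavily loaded task receives all spare slots.
tasks = [TaskLoad("segment", 100, 2.0), TaskLoad("filter", 10, 1.0)]
print(plan_scaling(tasks, spare_slots=3))  # {'segment': 4, 'filter': 1}
```

Each task is scaled according to its own predicted load, so a bottleneck task can grow while lightly loaded tasks stay at one instance. A real system would also need to account for instance start-up cost and changing input rates, which this sketch ignores.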