In this paper we propose a novel method for auto-scaling data-centric workflow tasks. Scaling is driven by a prediction mechanism in which the input data load on each task in a workflow is used to estimate that task's execution time. Using these load predictions, the framework can make informed decisions about scaling multiple workflow tasks independently, improving overall throughput and reducing workflow bottlenecks. The method was implemented in the WS-VLAM workflow system, and using an image analysis workflow we show that it achieves faster data processing rates and reduces overall workflow makespan.
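The idea of load-based scaling can be illustrated with a minimal sketch. This is not the WS-VLAM implementation: the names `TaskLoad`, `predicted_drain_seconds`, `instances_needed`, and `scaling_plan` are hypothetical, and the linear cost model (queued items multiplied by a mean per-item processing time) is an assumption made only for illustration. Each task is scaled independently to the smallest number of instances whose predicted time to drain the queued input meets a target.

```python
import math
from dataclasses import dataclass


@dataclass
class TaskLoad:
    """Hypothetical snapshot of one workflow task's input load."""
    name: str
    queued_items: int          # input data items waiting for this task
    mean_item_seconds: float   # observed mean processing time per item
    instances: int = 1         # instances currently running


def predicted_drain_seconds(task: TaskLoad) -> float:
    """Estimated time to process the queued input with the current instances.

    Assumes a linear cost model and perfect parallel speed-up.
    """
    total_work = task.queued_items * task.mean_item_seconds
    return total_work / max(task.instances, 1)


def instances_needed(task: TaskLoad, target_seconds: float,
                     max_instances: int) -> int:
    """Smallest instance count whose predicted drain time meets the target,
    clamped to the range [1, max_instances]."""
    total_work = task.queued_items * task.mean_item_seconds
    needed = math.ceil(total_work / target_seconds) if total_work > 0 else 1
    return max(1, min(needed, max_instances))


def scaling_plan(tasks, target_seconds=60.0, max_instances=8):
    """Decide an instance count for every task independently."""
    return {
        t.name: instances_needed(t, target_seconds, max_instances)
        for t in tasks
    }


if __name__ == "__main__":
    tasks = [
        TaskLoad("segment", queued_items=120, mean_item_seconds=2.0),
        TaskLoad("classify", queued_items=10, mean_item_seconds=0.5),
    ]
    for t in tasks:
        print(t.name, predicted_drain_seconds(t))
    print(scaling_plan(tasks))
```

In this sketch the heavily loaded task is scaled out while the lightly loaded one stays at a single instance, which is the kind of per-task decision that targets a bottleneck without scaling the whole workflow. A real predictor would likely need a richer model than a fixed mean per-item time.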