Toward fine-grained online task characteristics estimation in scientific workflows

Authors:
Rafael Ferreira da Silva;Gideon Juve;Ewa Deelman;Tristan Glatard;Frédéric Desprez;Douglas Thain;Benjamin Tovar;Miron Livny
Affiliations:
University of Lyon, Villeurbanne, France and University of Southern California, Marina Del Rey, CA;University of Southern California, Marina Del Rey, CA;University of Southern California, Marina Del Rey, CA;University of Lyon, Villeurbanne, France and McGill University, Canada;University of Lyon, Lyon, France;University of Notre Dame, Notre Dame, IN;University of Notre Dame, Notre Dame, IN;University of Wisconsin Madison, Madison, WI
Venue:
WORKS '13 Proceedings of the 8th Workshop on Workflows in Support of Large-Scale Science
Year:
2013

Abstract

Estimates of task characteristics, such as runtime, disk space, and memory consumption, are commonly used by scheduling algorithms and resource provisioning techniques to provide successful and efficient workflow executions. These methods assume that accurate estimates are available, but in production systems it is hard to compute such estimates with good accuracy. In this work, we first profile three real scientific workflows, collecting fine-grained information such as process I/O, runtime, memory usage, and CPU utilization. We then propose a method to automatically characterize workflow task needs based on these profiles. Our method estimates task runtime, disk space, and memory consumption based on the size of the tasks' input data. It looks for correlations between the parameters of a dataset, and if no correlation is found, the dataset is divided into smaller subsets using a clustering technique. Task behavior is then estimated from the ratio of parameter to input data size if the two are correlated, or from the mean value otherwise. However, task dependencies in scientific workflows lead to a chain of estimation errors. To correct such errors, we propose an online estimation process based on the MAPE-K loop, in which task executions are constantly monitored and estimates are updated accordingly. Experimental results show that our online estimation process yields much more accurate predictions than an offline approach, in which all task needs are estimated at once.
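
The estimation rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Pearson coefficient as the correlation test, the 0.8 threshold, and the names `estimate` and `OnlineEstimator` are assumptions made for illustration, and the clustering step is omitted entirely. The sketch only shows the choice between a ratio-based and a mean-based estimate, plus a simplified stand-in for the monitor-and-update idea of the online process.

```python
from statistics import mean

# Assumed cutoff for "correlated"; the abstract does not state a value.
CORRELATION_THRESHOLD = 0.8


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / (sxx * syy) ** 0.5


def estimate(history, input_size):
    """Estimate a task parameter (e.g. runtime) for a given input size.

    history is a list of (input_size, observed_value) pairs from past runs.
    If the parameter is positively correlated with input size, the estimate
    is the mean parameter/input-size ratio times the new input size;
    otherwise it falls back to the mean observed value.
    """
    if not history:
        raise ValueError("no observations available")
    sizes = [s for s, _ in history]
    values = [v for _, v in history]
    if len(history) >= 2 and pearson(sizes, values) >= CORRELATION_THRESHOLD:
        ratio = mean(v / s for s, v in history if s > 0)
        return ratio * input_size
    return mean(values)


class OnlineEstimator:
    """Hypothetical helper: re-estimates from all runs monitored so far."""

    def __init__(self):
        self.history = []

    def observe(self, input_size, value):
        # Record the measurement of a finished task.
        self.history.append((input_size, value))

    def estimate(self, input_size):
        # Each call uses every observation recorded up to this point.
        return estimate(self.history, input_size)
```

In use, a workflow engine would call `observe` each time a task finishes and `estimate` just before scheduling the next task, so that later estimates draw on measured values from upstream tasks instead of on earlier predictions.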