Scientific workflows have become an integral part of cyberinfrastructure as their computational complexity and data sizes have grown. However, the complexity of the distributed infrastructure makes it challenging to design new workflows, determine the right management policies, debug, test, or reproduce errors. Today, workflow engines manage the dependencies between the tasks of a workflow, and tools are available to wrap scientific codes. What is still needed is a customizable, isolated, and manageable testing container for the design, evaluation, and deployment of distributed workflows. To build such an environment, we need to model, represent, capture, and possibly reuse the execution flows within each task of a workflow in a way that accurately reflects its execution behavior. In this paper, we present the design and implementation of WORKEM, an extensible framework for representing and emulating workflow execution state. We also detail the use of the framework in two case studies: (a) the design and testing of an orchestration system and (b) the generation of a provenance database. Our evaluation shows that the framework has minimal overhead and scales to run hundreds of workflows in a short time with a high degree of parallelism.
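To make the idea of emulating workflow execution state concrete, the following is a minimal Python sketch, not WORKEM's actual design or API. It assumes a workflow is a DAG of tasks, that each task has a fixed emulated duration, and that parallelism is unlimited. The names `EmulatedTask` and `emulate`, and the `STARTED`/`FINISHED` state labels, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class EmulatedTask:
    """Hypothetical stand-in for one workflow task (not a WORKEM type)."""
    name: str
    duration: float                      # emulated run time, in seconds
    depends_on: list = field(default_factory=list)


def emulate(tasks):
    """Return (time, task_name, state) events for an emulated run.

    Assumption: unlimited parallelism, so a task starts as soon as
    all of its dependencies have finished. No real work is executed.
    """
    by_name = {t.name: t for t in tasks}
    graph = {t.name: set(t.depends_on) for t in tasks}
    finish = {}
    events = []
    # static_order() yields every task after all of its dependencies.
    for name in TopologicalSorter(graph).static_order():
        task = by_name[name]
        start = max((finish[d] for d in task.depends_on), default=0.0)
        finish[name] = start + task.duration
        events.append((start, name, "STARTED"))
        events.append((finish[name], name, "FINISHED"))
    events.sort(key=lambda e: e[0])      # order events by emulated time
    return events


if __name__ == "__main__":
    # Illustrative diamond-shaped workflow; task names are made up.
    workflow = [
        EmulatedTask("a", 1.0),
        EmulatedTask("b", 2.0, ["a"]),
        EmulatedTask("c", 5.0, ["a"]),
        EmulatedTask("d", 1.0, ["b", "c"]),
    ]
    for time, name, state in emulate(workflow):
        print(f"{time:5.1f}  {name}  {state}")
```

An event stream of this kind could in principle feed either use case named in the abstract, for example by being replayed against an orchestration system under test or recorded as provenance, though how WORKEM itself does this is not specified here.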