In scientific workflows, provenance data helps scientists understand, evaluate, and reproduce their results. Provenance data generated at runtime can also support workflow steering mechanisms. Providing steering facilities for workflows is considered a challenge because of the dynamic demands that arise during execution. To steer a workflow, scientists should be able, for example, to suspend (or stop) its execution when the approximate solution meets (or deviates from) preset criteria. These criteria are commonly evaluated using both provenance data (execution data) and domain-specific data. We claim that the final decision on whether to interfere with a workflow execution may only become feasible when scientists can steer workflows using provenance data enriched with domain-specific data. In this paper we propose an approach based on specialized software components, named Data Extractors (DE), that acquire domain-specific data from the data files produced during a scientific workflow execution. A DE gathers domain-specific data from the produced files and associates it with the existing provenance data in the provenance repository. We evaluated the proposed approach using a real bioinformatics workflow for comparative genomics executed on the SciCumulus parallel cloud workflow engine.
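The idea of a Data Extractor can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the design described in the paper or the actual SciCumulus schema: the `activation` and `domain_data` tables, the `key = value` file format, the `e_value` field, and the threshold are all hypothetical. The sketch parses a produced file, links the extracted values to an existing provenance record, and then evaluates a steering criterion over the enriched repository.

```python
import sqlite3


def extract_domain_data(text):
    """Parse numeric 'key = value' lines from a produced file (hypothetical format)."""
    data = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        try:
            data[key.strip()] = float(value)
        except ValueError:
            continue  # skip values that are not numeric
    return data


def attach_to_provenance(conn, activation_id, data):
    """Associate extracted values with an existing provenance record."""
    conn.executemany(
        "INSERT INTO domain_data (activation_id, name, value) VALUES (?, ?, ?)",
        [(activation_id, name, value) for name, value in data.items()],
    )


def should_stop(conn, name, threshold):
    """Steering criterion: true once any recorded value is at or below threshold."""
    row = conn.execute(
        "SELECT MIN(value) FROM domain_data WHERE name = ?", (name,)
    ).fetchone()
    return row[0] is not None and row[0] <= threshold


# Hypothetical provenance repository held in memory.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activation (id INTEGER PRIMARY KEY, activity TEXT, status TEXT)"
)
conn.execute(
    "CREATE TABLE domain_data ("
    "activation_id INTEGER REFERENCES activation(id), name TEXT, value REAL)"
)
conn.execute("INSERT INTO activation VALUES (1, 'alignment', 'FINISHED')")

# Stand-in for the content of a data file produced by a workflow activity.
produced_file = "e_value = 1e-5\nscore = 87.5\nnote = n/a\n"
attach_to_provenance(conn, 1, extract_domain_data(produced_file))

if should_stop(conn, "e_value", 1e-3):
    print("criterion met: the scientist may suspend or stop the execution")
```

Keeping the extracted values in the same repository as the execution records means a single query can combine both kinds of data, which is what makes a steering check of this kind possible at runtime.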