Data-centric scientific workflows are often modeled as dataflow process networks. The simplicity of the dataflow framework facilitates workflow design, analysis, and optimization. However, modeling "control-flow intensive" tasks with dataflow constructs often yields overly complicated workflows that are hard to comprehend, reuse, and maintain. We describe a generic framework, based on scientific workflow templates and frames, for embedding control-flow intensive subtasks within dataflow process networks. This approach handles complex control-flow seamlessly without sacrificing the benefits of dataflow. We illustrate our approach with a real-world scientific workflow from the astrophysics domain that requires remote execution and file transfer in a semi-reliable environment. For such workflows, we also describe a three-layered architecture based on frames and templates: the top layer consists of an overall dataflow process network; the second layer consists of a transducer template modeling the desired control-flow behavior; and the bottom layer consists of frames inside the template that are specialized by embedding the desired component implementation. Our approach enables scientific workflows that are more robust (fault-tolerance strategies can be defined by control-flow-driven transducer templates) and, at the same time, more reusable, since the embedding of frames and templates yields more structured and modular workflow designs.
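The layering described above can be sketched in code. The following is a minimal, hypothetical Python illustration (the class and function names are ours, not from the paper): a "frame" is a placeholder specialized with a concrete component implementation, and a transducer template wraps that frame with a control-flow driven fault-tolerance strategy (here, simple retry on failure) while exposing an ordinary tokens-in, tokens-out interface to the enclosing dataflow process network.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator

@dataclass
class RetryTransducerTemplate:
    """Hypothetical transducer template: wraps an embedded component
    (the specialized frame) with a retry-based fault-tolerance strategy,
    while presenting a plain dataflow interface to the outer network."""
    frame: Callable[[object], object]  # embedded component implementation
    max_retries: int = 3

    def process(self, tokens: Iterable[object]) -> Iterator[object]:
        """Consume input tokens and emit output tokens, retrying the
        embedded component on transient failures."""
        for token in tokens:
            for attempt in range(1, self.max_retries + 1):
                try:
                    yield self.frame(token)  # invoke the specialized frame
                    break
                except RuntimeError:
                    if attempt == self.max_retries:
                        raise  # give up after the last attempt

# Example: a component that fails twice (e.g., a flaky remote transfer)
# before succeeding; the template masks the transient failures.
attempts = {"n": 0}

def flaky_component(x):
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient failure")
    return x * 2

template = RetryTransducerTemplate(frame=flaky_component, max_retries=3)
results = list(template.process([5]))  # retries absorb the two failures
```

From the perspective of the surrounding dataflow network, `template.process` behaves like any other actor; the control-flow intensive retry logic stays encapsulated inside the template rather than leaking into the network's wiring, which is the modularity benefit the abstract claims.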