The enactment of scientific workflows involves distributing tasks to resources that reside in different administrative domains. Such resources range in granularity from a single machine to one or more clusters and file systems. Using distributed resources during workflow enactment is error prone and may lead to faults that vary in type, frequency of occurrence and complexity. However, the level of fault tolerance available within existing workflow engines varies significantly, ranging from no support at all (requiring the user to intervene) to automatic re-execution of a workflow when a fault is detected. Many scientific workflows have to operate over heterogeneous infrastructure in the presence of failures. Dealing with such failures in a more coherent way, so that a similar set of techniques can be applied across workflow engines, is therefore an important challenge. In this paper, we extend the concept of Quality of Service (QoS), where particular performance constraints need to be adhered to, with the concept of Quality of Resilience (QoR), a metric used to assess how resilient workflow enactment is likely to be in the presence of failures. We believe such a metric can guide: (i) the formulation of a workflow as a user-annotated Directed Acyclic Graph (DAG); and (ii) its subsequent enactment over available resources. We identify how QoR must be considered at different levels: from the user, workflow enactor and resource management perspectives. We identify the architectural elements that may be included within scientific workflows to support QoR and demonstrate the use of QoR in the Weka4WS data mining workflow system.
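To make the idea of a user-annotated DAG more concrete, the following is a minimal illustrative sketch in Python. It is not the QoR definition used in the paper: the `Task` class, the `success_prob` and `max_retries` annotations, the task names, and the assumption that task failures are independent are all hypothetical choices made for this example. It estimates how likely a small workflow is to complete when each task may be re-executed a bounded number of times.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Task:
    name: str
    # Hypothetical user annotations (assumptions, not taken from the paper):
    success_prob: float   # assumed chance that a single attempt succeeds
    max_retries: int = 0  # re-executions allowed after a fault
    depends_on: list = field(default_factory=list)

    def completion_prob(self) -> float:
        # Probability that at least one of (1 + max_retries) attempts succeeds,
        # assuming attempts fail independently of each other.
        return 1.0 - (1.0 - self.success_prob) ** (self.max_retries + 1)


def workflow_resilience(tasks):
    """Return (execution order, estimated probability that every task completes)."""
    graph = {t.name: set(t.depends_on) for t in tasks}
    # static_order() raises graphlib.CycleError if the graph is not a DAG.
    order = list(TopologicalSorter(graph).static_order())
    by_name = {t.name: t for t in tasks}
    prob = 1.0
    for name in order:
        prob *= by_name[name].completion_prob()
    return order, prob


# Hypothetical three-step data mining workflow.
tasks = [
    Task("load", success_prob=0.9, max_retries=1),
    Task("cluster", success_prob=0.8, max_retries=2, depends_on=["load"]),
    Task("report", success_prob=1.0, depends_on=["cluster"]),
]
order, prob = workflow_resilience(tasks)
print(order, round(prob, 5))
```

In this sketch, raising `max_retries` on the least reliable task is the cheapest way to raise the overall estimate, which is the kind of trade-off a resilience metric is meant to expose to the user and the workflow enactor.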