Atomicity and provenance support for pipelined scientific workflows

  • Authors:
  • Liqiang Wang; Shiyong Lu; Xubo Fei; Artem Chebotko; H. Victoria Bryant; Jeffrey L. Ram

  • Affiliations:
  • Department of Computer Science, University of Wyoming, USA; Department of Computer Science, Wayne State University, USA; Department of Computer Science, Wayne State University, USA; Department of Computer Science, Wayne State University, USA and Department of Computer Science, University of Texas - Pan American, USA; Department of Computer Science, University of Wyoming, USA; Department of Physiology, Wayne State University, USA

  • Venue:
  • Future Generation Computer Systems
  • Year:
  • 2009

Abstract

Today, many significant scientific discoveries are achieved through complex, distributed scientific computations that are structured and represented as scientific workflows. Although atomicity is a well-studied topic in transaction processing and business workflows, this important capability needs to be revisited in a scientific workflow environment. First, the semantics of atomicity must be defined in a dataflow-oriented scientific workflow model, particularly for pipelined execution of hierarchical scientific workflows. Second, in a scientific workflow environment, atomic regions are specified or inferred dynamically as needed and are committed implicitly, in contrast to the a priori, well-defined transaction boundaries and explicit commits of transaction processing and business workflows. Finally, although atomicity and provenance are related, their interactions and relationships have never been explored in the literature. In this paper, we propose: (i) an architecture for scientific workflow management systems that supports both provenance and atomicity; (ii) a dataflow-oriented atomicity model that supports the notions of commit and abort; and (iii) a dataflow-oriented provenance model that supports querying and visualizing provenance.