Steps in scientific workflows often generate collections of results, so the data flowing through a workflow becomes increasingly nested. Because conventional workflow components (actors) typically operate on simple or application-specific data types, additional actors are often required to manage these nested collections, and conventional workflows grow more complex as nesting deepens. This paper describes a new paradigm for developing scientific workflows that transparently manages nested data collections. Collection-oriented workflows have a number of advantages over conventional approaches, including simpler workflow designs (e.g., requiring fewer actors and control-flow constructs) that are invariant under changes in data nesting. Our implementation within the Kepler scientific workflow system supports the explicit representation of collections and collection schemas, concurrent operation over collection contents via multi-level pipeline parallelism, and the ready composition of collection-aware actors from conventional actors.
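The idea of composing a collection-aware actor from a conventional one can be illustrated with a minimal sketch. It is not Kepler's API and not the paper's implementation: the token encoding (`open`/`data`/`close`) and the names `flatten`, `collection_aware`, and `rebuild` are all assumptions made up for this example. Python generators stand in for pipelined actors; they are lazy but run in a single thread, so they only approximate pipeline parallelism.

```python
from typing import Any, Callable, Iterable, Iterator, List, Tuple

# Hypothetical encoding: a nested collection is streamed as a flat sequence
# of ("open", name), ("data", value), and ("close", name) tokens.
Token = Tuple[str, Any]


def flatten(name: str, items: List[Any]) -> Iterator[Token]:
    """Serialize a nested list into a stream of delimiter and data tokens."""
    yield ("open", name)
    for i, item in enumerate(items):
        if isinstance(item, list):
            yield from flatten(f"{name}/{i}", item)
        else:
            yield ("data", item)
    yield ("close", name)


def collection_aware(
    fn: Callable[[Any], Any]
) -> Callable[[Iterable[Token]], Iterator[Token]]:
    """Wrap a conventional per-item function as a stream actor.

    Data tokens are transformed by fn; collection delimiters pass through
    untouched, so the actor never needs to know how deeply data is nested.
    """
    def actor(stream: Iterable[Token]) -> Iterator[Token]:
        for kind, payload in stream:
            if kind == "data":
                yield (kind, fn(payload))
            else:
                yield (kind, payload)
    return actor


def rebuild(stream: Iterable[Token]) -> List[Any]:
    """Reassemble a token stream into a nested list (names are discarded)."""
    stack: List[List[Any]] = [[]]
    for kind, payload in stream:
        if kind == "open":
            stack.append([])
        elif kind == "close":
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            stack[-1].append(payload)
    return stack[0][0]


# Two conventional functions lifted into collection-aware actors.
square = collection_aware(lambda x: x * x)
double = collection_aware(lambda x: 2 * x)

# The same two-actor pipeline handles flat and deeply nested input.
nested = rebuild(double(square(flatten("root", [1, [2, 3], [[4]]]))))
flat = rebuild(double(square(flatten("root", [1, 2, 3]))))
```

Under these assumptions the pipeline definition does not change when the input gains or loses levels of nesting, which is the invariance property the abstract describes; a real system would also need typed collections, schemas, and true concurrency between actors.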