Simulation and computer-aided data analysis have become an integral part of many traditional sciences and have spawned virtual observatories and even entirely new disciplines, e.g., bioinformatics. Scientific workflow systems are built to model and automate scientific applications and thereby increase scientists' productivity. In this paper, we present desiderata that we believe scientific workflow systems should satisfy from a scientist's point of view. In particular, they should support data modeling, be resilient to changes in the input data, check workflow well-formedness, and automatically optimize workflow specifications for efficient execution. We argue that current approaches do not adequately address these desiderata; in particular, conventional workflows need to be changed radically to cope with common changes in the input data structure. Workflows built using a Collection-Oriented Modeling and Design (Comad) approach, on the other hand, exhibit much greater resilience to input changes. We propose to further extend and formalize Comad by creating a separate configuration layer that bridges the gap between scientific functionality (e.g., scripts, programs, or web services) and the high-level workflow graph. The design of this gap language and an appropriate type system is part of the proposed Ph.D. project. As an initial result, we show how to adopt XML regular expression types on the workflow channels and how to characterize actor behavior by defining actor signatures. This allows us to propagate schema information through the workflow, to predict the workflow output schema (well-formedness), and to automatically optimize data routing so that less data is shipped overall and workflow concurrency increases.