We present a novel, web-accessible scientific workflow system that makes large-scale comparative studies accessible without programming or excessive configuration. GPFlow allows a workflow defined on single input values to be automatically lifted to operate over collections of input values, and it supports the formation and processing of collections without explicit iteration constructs. We introduce a new model for collection processing, based on key aggregation and slicing, which guarantees processing integrity and facilitates automatic association of inputs. This allows scientific users to manage the combinatorial explosion of data values inherent in large-scale comparative studies. The approach is demonstrated using a core task from comparative genomics and builds upon our previous work on supporting combined interactive and batch operation through a lightweight web-based user interface.
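
The idea of lifting a single-value workflow step to keyed collections can be illustrated with a minimal sketch. This is not GPFlow's implementation or API: the functions `lift`, `aggregate`, and `slice_by`, the representation of a collection as a list of `(key, value)` pairs, and the toy sequence data are all assumptions made for illustration. Each key is a dictionary of dimension names to identifiers; inputs that share a dimension are associated only when their identifiers agree, and inputs with independent dimensions are combined pairwise.

```python
from itertools import product


def lift(fn):
    """Lift fn, written for single values, to keyed collections (hypothetical helper)."""
    def lifted(*collections):
        results = []
        for combo in product(*collections):
            merged, consistent = {}, True
            for key, _ in combo:
                for dim, ident in key.items():
                    # Inputs sharing a dimension must agree on its identifier.
                    if merged.setdefault(dim, ident) != ident:
                        consistent = False
            if consistent:
                results.append((merged, fn(*(value for _, value in combo))))
        return results
    return lifted


def aggregate(collection, dim):
    """Collapse dimension dim, gathering the values that agree on all other dimensions."""
    groups = {}
    for key, value in collection:
        rest = tuple(sorted((d, i) for d, i in key.items() if d != dim))
        groups.setdefault(rest, []).append(value)
    return [(dict(rest), values) for rest, values in groups.items()]


def slice_by(collection, dim, ident):
    """Keep only the items whose key has the given identifier on dimension dim."""
    return [(key, value) for key, value in collection if key.get(dim) == ident]


def matches(a, b):
    """Single-value step: count positions at which two sequences agree."""
    return sum(x == y for x, y in zip(a, b))


# Toy data (invented for this sketch).
queries = [({"query": "q1"}, "ACGT"), ({"query": "q2"}, "AAGT")]
targets = [({"target": "t1"}, "ACGA"), ({"target": "t2"}, "TCGT")]

scores = lift(matches)(queries, targets)          # independent dimensions: all pairs
best = lift(max)(aggregate(scores, "target"))     # best score per query
lengths = lift(len)(queries)
normalized = lift(lambda s, n: s / n)(best, lengths)  # shared "query" key: matched up
print(normalized)
```

In this sketch no step contains an explicit loop over the inputs: the single-value functions `matches`, `max`, and `len` are reused unchanged, and the keys alone decide which values are paired. Carrying the keys through every step is one simple way to keep results traceable to the inputs that produced them.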