Existing workflow management systems assume that scientists have a well-specified workflow design before execution begins. In reality, many scientific discoveries result from a dynamic process in which scientists repeatedly propose new hypotheses and test them through multiple trials of various experiments before obtaining a successful result. Consequently, not every experiment in a workflow execution necessarily contributes to the final result. In this paper, we investigate the problem of effectively reproducing the results of previous scientific workflow executions by discovering the critical experiments that led to the success and the logical constraints on their execution order. We design a relational schema and SQL queries for effectively recording the workflow execution log, efficiently identifying the critical experiments from the log, and recommending experiment reproduction strategies to users. Furthermore, we propose optimization techniques for evaluating these SQL queries that exploit the unique characteristics of the log data. Experimental evaluations demonstrate the speedup achieved by our approach.
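As a rough illustration of the idea of finding critical experiments with SQL over an execution log, the sketch below uses Python's built-in sqlite3 module. The tables `experiment` and `dependency`, their columns, and the helper functions are hypothetical names chosen for this example and are not the schema or queries from the paper. It assumes the log records which experiment consumed the output of which other experiment, and treats an experiment as critical if the final result transitively depends on it.

```python
import sqlite3


def build_demo_log():
    """Create an in-memory log with a hypothetical schema and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE experiment (id INTEGER PRIMARY KEY, name TEXT);
        -- One row per data dependency: consumer used an output of producer.
        CREATE TABLE dependency (producer INTEGER, consumer INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO experiment VALUES (?, ?)",
        [
            (1, "prepare sample"),
            (2, "test hypothesis A"),      # dead end
            (3, "test hypothesis B"),
            (4, "follow up on A"),         # dead end
            (5, "final successful run"),
        ],
    )
    conn.executemany(
        "INSERT INTO dependency VALUES (?, ?)",
        [(1, 2), (1, 3), (2, 4), (3, 5)],
    )
    return conn


def critical_experiments(conn, final_id):
    """Ids of experiments the final result transitively depends on."""
    rows = conn.execute(
        """
        WITH RECURSIVE critical(id) AS (
            SELECT ?
            UNION
            SELECT d.producer
            FROM dependency AS d
            JOIN critical AS c ON d.consumer = c.id
        )
        SELECT id FROM critical ORDER BY id
        """,
        (final_id,),
    ).fetchall()
    return [row[0] for row in rows]


def order_constraints(conn, critical):
    """Dependencies among critical experiments: producer must run first."""
    marks = ",".join("?" * len(critical))
    query = (
        "SELECT producer, consumer FROM dependency "
        f"WHERE producer IN ({marks}) AND consumer IN ({marks}) "
        "ORDER BY producer, consumer"
    )
    return conn.execute(query, critical + critical).fetchall()


conn = build_demo_log()
critical = critical_experiments(conn, 5)
constraints = order_constraints(conn, critical)
print(critical)
print(constraints)
```

In this toy log, experiments 2 and 4 never feed into the final run, so a reproduction would only need to rerun the remaining experiments in an order consistent with the returned constraints. A recursive query is just one simple way to express the backward traversal; the optimization techniques mentioned in the abstract are not reflected here.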