ACM Computing Surveys (CSUR)
Makeflow: a portable abstraction for data intensive computing on clusters, clouds, and grids
Proceedings of the 1st ACM SIGMOD Workshop on Scalable Workflow Execution Engines and Technologies
Why workflows break — Understanding and combating decay in Taverna workflows
E-SCIENCE '12 Proceedings of the 2012 IEEE 8th International Conference on E-Science (e-Science)
Dependency management remains a major challenge for all forms of software. A program implemented in a given environment typically has many implicit dependencies on programs, libraries, and other objects present within that environment. Moving applications between different runtime environments is likely to fail because of those external dependencies. Workflows suffer particularly from dependency management problems, precisely because they tie together multiple independent programs into a coherent whole. To address the problem of workflow decay, we propose applying the old idea of a "linker" to the new context of workflow systems. We have implemented a linker for the Makeflow workflow system and extended the concept to apply recursively to executables and scripting languages within the workflow. We evaluate the system by applying it to a selection of bioinformatics workflows, including BLAST, BWA, and SHRiMP, enabling them to be moved across multiple computation environments. We also show that the portability provided by packaging allows for improved performance.