Optimizing bioinformatics workflows for data analysis using cloud management techniques
Proceedings of the 6th workshop on Workflows in support of large-scale science
BPELPower-A BPEL execution engine for geospatial web services
Computers & Geosciences
Parallel software architecture for experimental workflows in computational biology on clouds
PPAM'11 Proceedings of the 9th international conference on Parallel Processing and Applied Mathematics - Volume Part II
Dis2PPI: A Workflow Designed to Integrate Proteomic and Genetic Disease Data
International Journal of Knowledge Discovery in Bioinformatics
Managing and Optimizing Bioinformatics Workflows for Data Analysis in Clouds
Journal of Grid Computing
Fundamenta Informaticae - Scalable Workflow Enactment Engines and Technology
Motivation: The rapidly increasing amounts of data available from new high-throughput methods have made data processing without automated pipelines infeasible. As several publications have pointed out, integrating data and analytic resources into workflow systems provides a solution to this problem and simplifies the task of data analysis. Various applications for defining and running workflows in the field of bioinformatics have been proposed and published, e.g. Galaxy, Mobyle, Taverna, Pegasus or Kepler. One of the main aims of such workflow systems is to enable scientists to focus on analysing their datasets instead of taking care of data management, job management or monitoring the execution of computational tasks. The currently available workflow systems achieve this goal, but differ fundamentally in how they execute workflows.

Results: We have developed the Conveyor software library, a multitiered generic workflow engine for composition, execution and monitoring of complex workflows. It features an open, extensible system architecture and concurrent program execution to exploit resources available on modern multicore CPU hardware. It offers the ability to build complex workflows with branches, loops and other control structures. Two example use cases illustrate the application of the versatile Conveyor engine to common bioinformatics problems.

Availability: The Conveyor application, including client and server, is available at http://conveyor.cebitec.uni-bielefeld.de.

Contact: conveyor@CeBiTec.Uni-Bielefeld.DE; blinke@ceBiTec.Uni-Bielefeld.De

Supplementary information: Supplementary data are available at Bioinformatics online.
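To make the idea of concurrent workflow execution from the Results paragraph more concrete, the following is a minimal Python sketch of a workflow whose independent steps run in parallel. It is not Conveyor's API or implementation: the function `run_workflow`, the step names and the toy sequences are all hypothetical, and the scheduling shown (a dependency graph driven by a thread pool) is only one common way to realise the idea. Branches and loops are not covered.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(steps, deps, max_workers=4):
    """Run a workflow given as a dependency graph (hypothetical helper).

    steps: maps a step name to a callable that receives a dict of the
           results of its upstream steps.
    deps:  maps a step name to the set of step names it depends on.
    Steps whose dependencies are all finished are submitted together,
    so independent steps can run concurrently.
    """
    sorter = TopologicalSorter({name: deps.get(name, set()) for name in steps})
    sorter.prepare()  # raises CycleError if the graph contains a cycle
    results = {}
    running = {}  # future -> step name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every step that has become ready.
            for name in sorter.get_ready():
                inputs = {d: results[d] for d in deps.get(name, set())}
                running[pool.submit(steps[name], inputs)] = name
            # Block until at least one running step has finished.
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                results[name] = future.result()
                sorter.done(name)  # may unlock downstream steps
    return results


# Toy workflow: "gc" and "length" both depend only on "load",
# so they are eligible to run at the same time.
def load(_):
    return ["ACGT", "GGCC", "ATAT"]

def gc(upstream):
    return [sum(c in "GC" for c in s) / len(s) for s in upstream["load"]]

def length(upstream):
    return [len(s) for s in upstream["load"]]

def report(upstream):
    return list(zip(upstream["length"], upstream["gc"]))

steps = {"load": load, "gc": gc, "length": length, "report": report}
deps = {"gc": {"load"}, "length": {"load"}, "report": {"gc", "length"}}

results = run_workflow(steps, deps)
print(results["report"])
```

A thread pool keeps the sketch short; for CPU-bound steps in Python, a process pool would be the more likely choice if the goal is to exploit several cores.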