Summary: We have developed RseqFlow, an RNA-Seq analysis workflow for single-ended Illumina reads. The workflow provides a set of analytic functions: quality control of sequencing data, generation of signal tracks for mapped reads, calculation of expression levels, identification of differentially expressed genes, and calling of coding SNPs. It is formalized and managed by the Pegasus Workflow Management System, which maps the analysis modules onto available computational resources, automatically executes the steps in the appropriate order, and monitors the entire run. RseqFlow is available as a Virtual Machine with all the necessary software preinstalled, which eliminates complex configuration and installation steps.
Availability and implementation: http://genomics.isi.edu/rnaseq
Contact: wangying@xmu.edu.cn; knowles@med.usc.edu; deelman@isi.edu; tingchen@usc.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
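The summary states that the workflow manager executes the analysis steps in the appropriate order. As a rough illustration of that idea only, the sketch below orders a set of steps by their dependencies using Python's standard library. It does not use the Pegasus API and is not RseqFlow's implementation. The step names, the read_mapping step, the dependency edges, and the run_step callback are all assumptions made for this example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency graph (an assumption, not taken from RseqFlow):
# each key is a step, each value is the set of steps it must wait for.
DEPENDENCIES = {
    "quality_control": set(),
    "read_mapping": {"quality_control"},
    "signal_tracks": {"read_mapping"},
    "expression_levels": {"read_mapping"},
    "differential_expression": {"expression_levels"},
    "coding_snp_calling": {"read_mapping"},
}


def execution_order(dependencies):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


def run_workflow(dependencies, run_step):
    """Call run_step(name) for each step in dependency order.

    run_step is a hypothetical callback that would launch one analysis
    module; here it can be any function that takes a step name.
    """
    completed = []
    for step in execution_order(dependencies):
        run_step(step)
        completed.append(step)
    return completed


if __name__ == "__main__":
    run_workflow(DEPENDENCIES, lambda name: print("running", name))
```

A real workflow manager additionally chooses where each step runs and tracks failures; this sketch covers only the ordering aspect.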