Most large-scale scientific experiments modeled as scientific workflows produce large amounts of data and require workflow parallelism to reduce execution time. Some existing Scientific Workflow Management Systems (SWfMS) exploit parallelism techniques such as parameter sweep and data fragmentation. In those systems, several computing resources are used to execute many computational tasks in homogeneous environments, such as multiprocessor machines or clusters. Cloud computing has become a popular high-performance computing model in which (virtualized) resources are provided as services over the Web. Some scientists are starting to adopt the cloud model in scientific domains and are moving their scientific workflows (programs and data) from local environments to the cloud. Nevertheless, it is still difficult for scientists to express a parallel computing paradigm for a workflow in the cloud. Capturing distributed provenance data in the cloud is also an open issue. Existing approaches for executing scientific workflows in parallel focus mainly on homogeneous environments, whereas in the cloud the scientist has to manage new aspects such as the initialization of virtualized instances, scheduling over different cloud environments, the impact of data transfer, and the management of instance images. In this paper we propose SciCumulus, a cloud middleware that exploits parameter sweep and data fragmentation parallelism in scientific workflow activities, with provenance support. It sits between the SWfMS and the cloud and is designed with cloud specificities in mind. We evaluated our approach by executing simulated experiments to analyze the overhead that clouds impose on workflow execution time.
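The two parallelism techniques named in the abstract can be illustrated with a minimal sketch. It is not SciCumulus code: the `activity` function, its `alpha` and `beta` parameters, and the helper names are hypothetical, and a local thread pool stands in for the virtualized cloud instances. Parameter sweep runs the same activity on the same data once per parameter combination; data fragmentation runs the activity with fixed parameters on separate pieces of the input and merges the results.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def activity(records, alpha, beta):
    """Hypothetical workflow activity: scales and shifts each record."""
    return [alpha * r + beta for r in records]


def parameter_sweep(records, alphas, betas, workers=4):
    """Run the same activity on the same data once per parameter combination."""
    combos = list(product(alphas, betas))
    # Assumption: a thread pool stands in for a set of cloud instances.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(activity, records, a, b) for a, b in combos]
        return {combo: f.result() for combo, f in zip(combos, futures)}


def fragment(records, n_fragments):
    """Split the input into at most n_fragments contiguous, near-equal chunks."""
    size, extra = divmod(len(records), n_fragments)
    chunks, start = [], 0
    for i in range(n_fragments):
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when there are few records
            chunks.append(records[start:end])
        start = end
    return chunks


def data_fragmentation(records, alpha, beta, n_fragments=4):
    """Run the activity with fixed parameters on each fragment, then merge in order."""
    chunks = fragment(records, n_fragments)
    with ThreadPoolExecutor(max_workers=n_fragments) as pool:
        parts = pool.map(lambda c: activity(c, alpha, beta), chunks)
        return [x for part in parts for x in part]
```

In this sketch the sweep returns one result per `(alpha, beta)` pair, while fragmentation returns a single merged result identical to a sequential run. A real cloud deployment would also have to account for the instance startup and data transfer costs the abstract mentions, which a local thread pool does not model.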