This paper proposes Pwrake, a flexible parallel and distributed workflow management tool based on Rake, a domain-specific language for building applications in the Ruby programming language. Rake is similar to make and ant. It uses a Rakefile, which is equivalent to a Makefile in make but is written in Ruby. Because Ruby is a flexible and extensible language, Rake can serve as a powerful workflow management language. Pwrake extends Rake to manage distributed and parallel workflow executions, including remote job submission and management of parallel executions. This paper discusses the design and implementation of Pwrake and demonstrates the expressive power of its language and the extensibility of the system, using as a case study a practical data-intensive e-Science workflow for astronomical data analysis on the Gfarm file system. When the scheduling algorithm is extended to be aware of file locations, a 20% speedup is observed using 8 nodes (32 cores) of a PC cluster. Using two PC clusters located at different institutions, the file-location-aware scheduling shows scalable speedup. The extensible Pwrake is a promising workflow management tool even for wide-area data analysis.
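The idea of file-location-aware scheduling can be illustrated with a minimal Ruby sketch. This is not Pwrake's actual scheduler, whose algorithm is not described in this abstract. It is a simple greedy assignment written under stated assumptions: the `Task` struct, the `assign_locality_aware` function, and the node and task names are all hypothetical, and each node is assumed to run one task at a time.

```ruby
# Hypothetical sketch of locality-aware task assignment.
# It does NOT reproduce Pwrake's scheduler; all names are illustrative.

# A task and the hosts that already store its input file(s).
Task = Struct.new(:name, :input_hosts)

# Assign tasks to idle nodes, preferring a node that holds the task's input.
# Returns a Hash mapping task name => node name.
def assign_locality_aware(tasks, nodes)
  idle = nodes.dup
  pending = tasks.dup
  assignment = {}

  # First pass: place each task on an idle node that stores its input.
  pending.reject! do |task|
    local = idle.find { |node| task.input_hosts.include?(node) }
    next false unless local # no local node is free; keep the task pending

    idle.delete(local)
    assignment[task.name] = local
    true
  end

  # Second pass: remaining tasks take any idle node (a remote read).
  pending.each do |task|
    break if idle.empty?

    assignment[task.name] = idle.shift
  end

  assignment
end

tasks = [
  Task.new("project_a", ["node2"]),
  Task.new("project_b", ["node1"]),
  Task.new("project_c", ["node1"])
]

plan = assign_locality_aware(tasks, %w[node1 node2 node3])
plan.each { |name, node| puts "#{name} -> #{node}" }
```

In this sketch, `project_a` and `project_b` run where their inputs are stored, while `project_c` falls back to the remaining idle node because `node1` is already taken. A real scheduler on a distributed file system such as Gfarm would obtain the replica locations from the file system rather than from a hard-coded list.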