Design and Implementation of GXP Make -- A Workflow System Based on Make
ESCIENCE '10 Proceedings of the 2010 IEEE Sixth International Conference on e-Science
This paper describes the rationale for designing workflow systems based on Unix make, presenting a number of idioms useful for workflows comprising many tasks. It also demonstrates a specific design and implementation of such a workflow system, called GXP make. GXP make supports all the features of GNU make and extends its target platforms from single-node systems to clusters, clouds, supercomputers, and distributed systems. Notably, this is achieved with a very small code base that does not modify the GNU make implementation at all. Although this design is not ideal for performance, it achieves useful performance and scalability, dispatching one million tasks in approximately 5000 s (200 tasks per second, including dependence analysis) on an 8-core Intel Nehalem node. As a real application, the paper describes the recognition and classification of protein-protein interactions from biomedical texts on a supercomputer with more than 8000 cores.