In this paper we examine how to optimize disk usage when scheduling large-scale scientific workflows onto distributed resources, where the workflows are data-intensive and require large amounts of data storage, and the resources have limited storage. Our approach is two-fold: we minimize the amount of space a workflow requires during execution by removing data files at runtime when they are no longer needed, and we demonstrate that workflows may have to be restructured to reduce their overall data footprint. We show the results of our data management and workflow restructuring solutions using a Laser Interferometer Gravitational-Wave Observatory (LIGO) application and an astronomy application, Montage, running on a large-scale production grid, the Open Science Grid. We show that dynamic data cleanup techniques alone can reduce the data footprint of Montage by 48%, whereas LIGO Scientific Collaboration workflows require additional restructuring to achieve a 56% reduction in data space usage. We also examine the cost of the workflow restructuring in terms of the application's runtime.
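The runtime cleanup idea can be illustrated with a small sketch. This is not the paper's implementation: the function name `peak_footprint`, the `(name, inputs, outputs)` task layout, and the toy file sizes are all hypothetical, and the tasks are assumed to run one at a time in the listed order. The sketch deletes a file as soon as its last consumer has finished and compares the peak disk usage with and without cleanup.

```python
from collections import defaultdict


def peak_footprint(tasks, sizes, cleanup):
    """Return the peak storage used while running `tasks` in order.

    tasks:   list of (name, inputs, outputs) tuples (hypothetical layout)
    sizes:   dict mapping file name -> size
    cleanup: if True, delete a file once no remaining task reads it
    """
    # Count how many tasks still need each file as input.
    remaining = defaultdict(int)
    for _, inputs, _ in tasks:
        for f in inputs:
            remaining[f] += 1

    # Files that no task produces are assumed to be staged in up front.
    produced = {f for _, _, outputs in tasks for f in outputs}
    on_disk = {f for _, inputs, _ in tasks for f in inputs} - produced
    used = sum(sizes[f] for f in on_disk)
    peak = used

    for _, inputs, outputs in tasks:
        # The task writes its outputs while its inputs are still on disk.
        for f in outputs:
            on_disk.add(f)
            used += sizes[f]
        peak = max(peak, used)

        # After the task finishes, drop inputs that nobody else needs.
        for f in inputs:
            remaining[f] -= 1
            if cleanup and remaining[f] == 0 and f in on_disk:
                on_disk.remove(f)
                used -= sizes[f]

    return peak


# Toy three-step pipeline; every file has size 10 (made-up numbers).
tasks = [
    ("t1", ["raw"], ["a"]),
    ("t2", ["a"], ["b"]),
    ("t3", ["b"], ["c"]),
]
sizes = {"raw": 10, "a": 10, "b": 10, "c": 10}

without_cleanup = peak_footprint(tasks, sizes, cleanup=False)
with_cleanup = peak_footprint(tasks, sizes, cleanup=True)
print(without_cleanup, with_cleanup)  # 40 20
```

In this toy pipeline cleanup halves the peak footprint because each intermediate file has exactly one consumer. When many files stay live until a late task reads them, cleanup alone helps less, which is the situation in which the abstract says restructuring the workflow becomes necessary.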