Scientific applications often perform complex computational analyses that consume and produce large data sets. We are concerned with data placement policies that distribute data in ways that benefit application execution, for example by placing data sets so that they can be staged into or out of computations efficiently, or by replicating them for improved performance and reliability. In particular, we study the relationship between data placement services and workflow management systems. In this paper, we explore the interactions between two such services used in large-scale science today: the Data Replication Service and the Pegasus workflow management system. We evaluate the benefits of prestaging data with the Data Replication Service compared with using the native data stage-in mechanisms of Pegasus. Our experiments use the astronomy application Montage, which we modify to study how input data size affects the benefits of data prestaging. As the size of the input data sets increases, prestaging with a data placement service can significantly improve the performance of the overall analysis.