The Center for Plasma Edge Simulation project aims to automate the tedious tasks of simulation monitoring, data archival, and simulation code coupling using the Kepler scientific workflow environment. The technology has been successfully applied to migrate a 10 TB combustion data archive from NERSC to ORNL, a task for which no other automated solution was available. This paper describes the workflow, which migrates large files from mass storage systems using external tools and temporary staging to disk. It runs the stages in a pipeline-parallel fashion, parallelizes file transfers, and uses special checkpointing so that the workflow is restartable and can retry operations that failed earlier. The advantage of such a workflow over specialized data migration services is its independence from specific systems: it can be adapted by configuring which external tools are used. The advantage over scripts is robust execution (handling of failures and timeouts) and efficiency (parallelization wherever possible).
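The combination of staging, parallel transfer, and checkpointed restart described above can be illustrated with a minimal Python sketch. It is not the Kepler workflow from the paper. The names `migrate`, `load_checkpoint`, `stage`, and `transfer`, and the JSON-lines checkpoint file, are all assumptions made for illustration, and the callables stand in for whatever external tools a site would configure. Timeout handling is omitted.

```python
import json
import threading
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def load_checkpoint(path):
    """Return the set of file names already recorded as migrated."""
    p = Path(path)
    if not p.exists():
        return set()
    with p.open() as fh:
        return {json.loads(line)["file"] for line in fh if line.strip()}


def migrate(files, stage, transfer, checkpoint_path, workers=4):
    """Stage and transfer each file, skipping those already checkpointed.

    `stage` and `transfer` are hypothetical callables wrapping external tools.
    Returns the files that failed in this run; calling the function again
    retries exactly those files, because only successes are checkpointed.
    """
    done = load_checkpoint(checkpoint_path)
    pending = [f for f in files if f not in done]
    lock = threading.Lock()
    failed = []

    def transfer_one(name, staged):
        try:
            transfer(staged)
            # Record success only after the transfer has completed.
            with lock, open(checkpoint_path, "a") as fh:
                fh.write(json.dumps({"file": name}) + "\n")
        except Exception:
            with lock:
                failed.append(name)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for name in pending:
            try:
                # Staging runs in this thread, one file at a time.
                staged = stage(name)
            except Exception:
                with lock:
                    failed.append(name)
                continue
            # Transfers run in the pool, so they overlap with each other
            # and with the staging of the next file.
            pool.submit(transfer_one, name, staged)
    return failed
```

Because the checkpoint is append-only and written only on success, an interrupted or partially failed run can simply be started again with the same file list. Completed files are skipped and failed ones are attempted again.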