Geoscience analysis is currently limited by cumbersome access to, and manipulation of, large datasets held at remote sources. Because these analysis workloads are data-heavy and compute-light, they represent a class of applications poorly suited to computational grids optimized for compute-intensive work. We present the Script Workflow Analysis for MultiProcessing (SWAMP) system, which relocates data-intensive workflows from scientists' workstations to the hosting datacenters in order to reduce data transfer and exploit locality. Colocating computation with data takes advantage of the typically reductive character of these workflows, in which outputs are much smaller than inputs, allowing SWAMP to complete workflows in a fraction of the time and with much less data transfer. We describe SWAMP's implementation and its interface, which is designed to accept scientists' existing script-based workflows. SWAMP's workflow analysis capability allows it to detect dependencies, optimize I/O, and dynamically parallelize execution. Benchmarks with a production geoscience workflow quantify drastic reductions in transfer time, computation time, and end-to-end execution time.
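To make the idea of dependency detection and parallelization concrete, the following is a minimal illustrative sketch, not SWAMP's actual implementation. It assumes a simplified script in which every line is a single command whose last positional argument is its output file and whose earlier positional arguments are input files (the convention followed by NCO operators such as ncra and ncbo). The function names and file names are hypothetical.

```python
import shlex


def parse_command(line):
    """Return (inputs, output) for one command line.

    Assumption: flags start with '-' and take no separate value, the last
    positional argument is the output file, the others are inputs.
    """
    tokens = shlex.split(line)
    positionals = [t for t in tokens[1:] if not t.startswith("-")]
    return positionals[:-1], positionals[-1]


def build_dependencies(lines):
    """Map each command index to the earlier commands whose output it reads."""
    producer = {}  # file name -> index of the latest command that wrote it
    deps = {}
    for index, line in enumerate(lines):
        inputs, output = parse_command(line)
        deps[index] = {producer[name] for name in inputs if name in producer}
        producer[output] = index
    return deps


def parallel_stages(deps):
    """Group commands into stages; commands within a stage are independent."""
    remaining = dict(deps)
    done = set()
    stages = []
    while remaining:
        # A command is ready once everything it depends on has completed.
        ready = sorted(i for i, needed in remaining.items() if needed <= done)
        stages.append(ready)
        done.update(ready)
        for i in ready:
            del remaining[i]
    return stages


script = [
    "ncra jan.nc feb.nc winter.nc",      # average two input files
    "ncra jun.nc jul.nc summer.nc",      # independent of the first command
    "ncbo summer.nc winter.nc diff.nc",  # reads both intermediate results
]

deps = build_dependencies(script)
print(deps)
print(parallel_stages(deps))
```

In this sketch the two averaging commands share no files and land in the same stage, so they could run concurrently, while the final command waits for both. Only diff.nc would need to leave the datacenter, which is the reductive property the abstract relies on.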