Managing Large-Scale Workflow Execution from Resource Provisioning to Provenance Tracking: The CyberShake Example

  • Authors:
  • Ewa Deelman;Scott Callaghan;Edward Field;Hunter Francoeur;Robert Graves;Nitin Gupta;Vipin Gupta;Thomas H. Jordan;Carl Kesselman;Philip Maechling;John Mehringer;Gaurang Mehta;David Okaya;Karan Vahi;Li Zhao

  • Affiliations:
  • USC Information Sciences Institute, USA;US Geological Survey, USA;University of Southern California, Los Angeles, USA;US Geological Survey, USA;URS Corporation, USA;US Geological Survey, USA;US Geological Survey, USA;US Geological Survey, USA;USC Information Sciences Institute, USA;US Geological Survey, USA;US Geological Survey, USA;USC Information Sciences Institute, USA;US Geological Survey, USA;USC Information Sciences Institute, USA;US Geological Survey, USA

  • Venue:
  • E-SCIENCE '06 Proceedings of the Second IEEE International Conference on e-Science and Grid Computing
  • Year:
  • 2006

Abstract

This paper discusses the process of building an environment where large-scale, complex scientific analyses can be scheduled onto a heterogeneous collection of computational and storage resources. The example application is the Southern California Earthquake Center (SCEC) CyberShake project, an analysis designed to compute probabilistic seismic hazard curves for sites in the Los Angeles area. We explain which software tools were used to build the system and describe their functionality and interactions. We show the results of running the CyberShake analysis, which included over 250,000 jobs, using resources available through SCEC and the TeraGrid.
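To illustrate the kind of scheduling problem the abstract describes, the sketch below groups a toy workflow of dependent jobs into batches that can be dispatched in parallel once their inputs exist. This is a hypothetical minimal sketch for intuition only, not the paper's actual workflow system; the job names (`mesh`, `sim_a`, `hazard_curve`) are invented stand-ins for CyberShake's real processing stages.

```python
from collections import defaultdict, deque

def topological_batches(jobs, deps):
    """Group DAG jobs into batches whose members have no unmet
    dependencies, so each batch can run concurrently on any
    available compute resource."""
    indegree = {j: 0 for j in jobs}
    children = defaultdict(list)
    for job, parent in deps:  # (job, parent) means job needs parent's output
        indegree[job] += 1
        children[parent].append(job)
    ready = deque(j for j in jobs if indegree[j] == 0)
    batches = []
    while ready:
        batch = list(ready)
        ready.clear()
        batches.append(batch)
        for done in batch:  # completing a job may release its children
            for child in children[done]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
    return batches

# Toy structure: one mesh job fans out to per-rupture simulations,
# which all feed a single hazard-curve job.
jobs = ["mesh", "sim_a", "sim_b", "hazard_curve"]
deps = [("sim_a", "mesh"), ("sim_b", "mesh"),
        ("hazard_curve", "sim_a"), ("hazard_curve", "sim_b")]
print(topological_batches(jobs, deps))
# → [['mesh'], ['sim_a', 'sim_b'], ['hazard_curve']]
```

At CyberShake's scale the fan-out stage contains hundreds of thousands of jobs rather than two, which is why automated planning, provenance tracking, and resource provisioning become essential.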