High-performance remote access to climate simulation data: a challenge problem for data grid technologies

  • Authors:
  • Bill Allcock; Ian Foster; Veronika Nefedova; Ann Chervenak; Ewa Deelman; Carl Kesselman; Jason Lee; Alex Sim; Arie Shoshani; Bob Drach; Dean Williams

  • Affiliations:
  • Argonne National Laboratory; Argonne National Laboratory; Argonne National Laboratory; University of Southern California; University of Southern California; University of Southern California; Lawrence Berkeley National Laboratory; Lawrence Berkeley National Laboratory; Lawrence Berkeley National Laboratory; Lawrence Livermore National Laboratory; Lawrence Livermore National Laboratory

  • Venue:
  • Proceedings of the 2001 ACM/IEEE conference on Supercomputing
  • Year:
  • 2001

Abstract

In numerous scientific disciplines, terabyte- and soon petabyte-scale data collections are emerging as critical community resources. A new class of Data Grid infrastructure is required to support management, transport, distributed access to, and analysis of these datasets by potentially thousands of users. Researchers who face this challenge include the Climate Modeling community, which performs long-duration computations accompanied by frequent output of very large files that must be further analyzed. We describe the Earth System Grid prototype, which brings together advanced analysis, replica management, data transfer, request management, and other technologies to support high-performance, interactive analysis of replicated data. We present performance results that demonstrate our ability to manage the location and movement of large datasets from the user's desktop. We report on experiments conducted over SciNET at SC'2000, where we achieved peak performance of 1.55 Gb/s and sustained performance of 512.9 Mb/s for data transfers between Texas and California.
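
To put the reported rates in context, the following minimal sketch estimates how long a dataset would take to move at the peak and sustained figures quoted above. It rests on assumptions that are not in the abstract: the rates are read as decimal bits per second (Gb/s = 10^9 bits/s, Mb/s = 10^6 bits/s), the 1 TB dataset size is hypothetical, and the function name `transfer_time_seconds` is illustrative only.

```python
# Back-of-the-envelope transfer-time estimate for the rates in the abstract.
# Assumptions (not from the paper): decimal units, and a hypothetical
# 1 TB dataset chosen only to illustrate the "terabyte-scale" regime.

def transfer_time_seconds(size_bytes: float, rate_bits_per_s: float) -> float:
    """Return the ideal transfer time, ignoring protocol overhead."""
    if rate_bits_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_bytes * 8 / rate_bits_per_s

DATASET_BYTES = 1e12          # hypothetical 1 TB dataset (decimal terabyte)
PEAK_RATE = 1.55e9            # 1.55 Gb/s, read as bits per second
SUSTAINED_RATE = 512.9e6      # 512.9 Mb/s, read as bits per second

peak_hours = transfer_time_seconds(DATASET_BYTES, PEAK_RATE) / 3600
sustained_hours = transfer_time_seconds(DATASET_BYTES, SUSTAINED_RATE) / 3600

print(f"At peak rate:      {peak_hours:.2f} h")
print(f"At sustained rate: {sustained_hours:.2f} h")
```

Under these assumptions, a 1 TB dataset would need roughly 1.4 hours at the peak rate and roughly 4.3 hours at the sustained rate, which suggests why sustained throughput, rather than peak, governs the experience of interactive analysis.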