On using virtual circuits for GridFTP transfers

  • Authors and affiliations:
  • Z. Liu, University of Virginia (UVA), Charlottesville, VA
  • M. Veeraraghavan, University of Virginia (UVA), Charlottesville, VA
  • Z. Yan, University of Virginia (UVA), Charlottesville, VA
  • C. Tracy, Energy Sciences Network (ESnet), Berkeley, CA
  • J. Tie, University of Chicago, Chicago, IL
  • I. Foster, University of Chicago, Chicago, IL and Argonne National Laboratory, Argonne, IL
  • J. Dennis, National Center for Atmospheric Research (NCAR), Boulder, CO
  • J. Hick, National Energy Research Scientific Computing Center (NERSC), Berkeley, CA
  • Y. Li, SLAC National Accelerator Laboratory, Menlo Park, CA
  • W. Yang, SLAC National Accelerator Laboratory, Menlo Park, CA

  • Venue:
  • SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
  • Year:
  • 2012

Abstract

The goal of this work is to characterize scientific data transfers and to determine whether dynamic virtual circuit (VC) service is suitable for these transfers in place of the IP-routed service used today. Specifically, logs collected by servers running GridFTP, a commonly used scientific data transfer application, were obtained from three US supercomputing and scientific research centers (NERSC, SLAC, and NCAR) and analyzed. Dynamic VC service, a relatively new offering from providers such as ESnet and Internet2, allows a path to be selected and a rate-guaranteed connection to be established on it before data transfer begins. Because VC setup incurs overhead, the first analysis of the GridFTP transfer logs characterizes the duration of sessions, where a session consists of multiple back-to-back transfers executed in batch mode between the same two GridFTP servers. Of the NCAR-NICS sessions analyzed, 56% (accounting for 90% of all transfers) would have been long enough to be served with dynamic VC service. An analysis of transfer logs across four paths (NCAR-NICS, SLAC-BNL, NERSC-ORNL, and NERSC-ANL, where NICS, BNL, ORNL, and ANL are other US national laboratories) shows significant throughput variance. For example, on the NERSC-ORNL path, the inter-quartile range was 695 Mbps, with a maximum of 3.64 Gbps and a minimum of 758 Mbps. An analysis of the impact of various factors that are potential causes of this variance is also presented.
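
The session and inter-quartile-range analyses described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The record layout (source, destination, start, end in seconds), the gap threshold MAX_GAP_S used to decide that two transfers are back-to-back, and the minimum session duration MIN_VC_SESSION_S assumed to justify VC setup overhead are all hypothetical choices made for illustration; the abstract does not state these values.

```python
from statistics import quantiles

# Hypothetical parameters, not taken from the paper.
MAX_GAP_S = 60.0          # max idle gap for two transfers to count as back-to-back
MIN_VC_SESSION_S = 600.0  # assumed session length that amortizes VC setup overhead


def group_sessions(transfers, max_gap_s=MAX_GAP_S):
    """Group (src, dst, start_s, end_s) records into sessions per server pair."""
    sessions = []
    latest = {}  # (src, dst) -> index of that pair's most recent session
    for src, dst, start, end in sorted(transfers, key=lambda t: t[2]):
        key = (src, dst)
        idx = latest.get(key)
        if idx is not None and start - sessions[idx]["end"] <= max_gap_s:
            # Transfer starts soon after the previous one ended: same session.
            sessions[idx]["end"] = max(sessions[idx]["end"], end)
            sessions[idx]["count"] += 1
        else:
            sessions.append({"pair": key, "start": start, "end": end, "count": 1})
            latest[key] = len(sessions) - 1
    return sessions


def vc_suitable_fractions(sessions, min_duration_s=MIN_VC_SESSION_S):
    """Return (fraction of sessions, fraction of transfers) in long-enough sessions."""
    long_sessions = [s for s in sessions if s["end"] - s["start"] >= min_duration_s]
    total_transfers = sum(s["count"] for s in sessions)
    frac_sessions = len(long_sessions) / len(sessions)
    frac_transfers = sum(s["count"] for s in long_sessions) / total_transfers
    return frac_sessions, frac_transfers


def interquartile_range(values):
    """Inter-quartile range (Q3 - Q1) of per-transfer throughput values."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


# Synthetic records for illustration only; these are not data from the paper.
example_transfers = [
    ("A", "B", 0.0, 300.0),
    ("A", "B", 310.0, 650.0),
    ("A", "B", 660.0, 900.0),
    ("A", "B", 5000.0, 5100.0),
]
example_sessions = group_sessions(example_transfers)
example_fractions = vc_suitable_fractions(example_sessions)
```

With the synthetic records above, the first three transfers merge into one 900-second session and the fourth forms a short session of its own. This shows how the fraction of qualifying sessions can be much lower than the fraction of qualifying transfers, the pattern the abstract reports for the NCAR-NICS path.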