Cumulus: an open source storage cloud for science

  • Authors:
  • John Bresnahan;Kate Keahey;David LaBissoniere;Tim Freeman

  • Affiliations:
  • Argonne National Lab, Argonne, IL, USA;Argonne National Lab, Argonne, IL, USA;University of Chicago, Chicago, IL, USA;University of Chicago, Chicago, IL, USA

  • Venue:
  • Proceedings of the 2nd international workshop on Scientific cloud computing
  • Year:
  • 2011

Abstract

Amazon's S3 protocol has emerged as the de facto interface for storage in the commercial data cloud. However, it is closed source and unavailable to the numerous science data centers all over the country. Just as Amazon's Simple Storage Service (S3) provides reliable data cloud access to commercial users, scientific data centers must provide their users with a similar level of service. Ideally, scientific data centers could allow the use of the same clients and protocols that have proven effective for Amazon's users. But how well does the S3 REST interface compare with the data cloud transfer services used in today's computational centers? Does it have the features needed to support the scientific community? If not, can it be extended to include these features without loss of compatibility? Can it scale and distribute resources equally when presented with common scientific usage patterns? We address these questions by presenting Cumulus, an open source implementation of the Amazon S3 REST API. It is packaged with the Nimbus IaaS toolkit and provides scalable and reliable access to scientific data. Its performance compares favorably with that of GridFTP and SCP, and we have added features necessary to support the econometrics important to the scientific community.