GPFS: A Shared-Disk File System for Large Computing Clusters
FAST '02 Proceedings of the Conference on File and Storage Technologies
The Globus Striped GridFTP Framework and Server
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
Virtual workspaces: Achieving quality of service and quality of life in the Grid
Scientific Programming - Dynamic Grids and Worldwide Computing
PVFS: a parallel file system for linux clusters
ALS'00 Proceedings of the 4th annual Linux Showcase & Conference - Volume 4
Amazon S3 for science grids: a viable solution?
DADC '08 Proceedings of the 2008 international workshop on Data-aware distributed computing
Flying Low: Simple Leases with Workspace Pilot
Euro-Par '08 Proceedings of the 14th international Euro-Par conference on Parallel Processing
Queue - Scalable Web Services
The Eucalyptus Open-Source Cloud-Computing System
CCGRID '09 Proceedings of the 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid
High throughput data-compression for cloud storage
Globe'10 Proceedings of the Third international conference on Data management in grid and peer-to-peer systems
The Hadoop Distributed File System
MSST '10 Proceedings of the 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST)
Evaluating cloud storage services for tightly-coupled applications
Euro-Par'12 Proceedings of the 18th international conference on Parallel processing workshops
Towards a unified taxonomy and architecture of cloud frameworks
Future Generation Computer Systems
Managing volunteer resources in the cloud
International Journal of Computational Science and Engineering
Hi-index | 0.00 |
Amazon's S3 protocol has emerged as the de facto interface for storage in the commercial data cloud. However, it is closed source and unavailable to the numerous science data centers all over the country. Just as Amazon's Simple Storage Service (S3) provides reliable data cloud access to commercial users, scientific data centers must provide their users with a similar level of service. Ideally scientific data centers could allow the use of the same clients and protocols that have proven effective to Amazon's users. But how well does the S3 REST interface compare with the data cloud transfer services used in today's computational centers? Does it have the features needed to support the scientific community? If not, can it be extended to include these features without loss of compatibility? Can it scale and distribute resources equally when presented with common scientific the usage patterns? We address these questions by presenting Cumulus, an open source implementation of the Amazon S3 REST API. It is packaged with the Nimbus IaaS toolkit and provides scalable and reliable access to scientific data. Its performance compares favorably with that of GridFTP and SCP, and we have added features necessary to support the econometrics important to the scientific community.