ROARS: a robust object archival system for data intensive scientific computing
Distributed and Parallel Databases
As scientific research becomes more data-intensive, there is an increasing need for scalable, reliable, high-performance storage systems. Such data repositories must provide both data archival services and rich metadata, and must integrate cleanly with large-scale computing resources. ROARS is a hybrid approach to distributed storage that provides both large, robust, scalable storage and efficient rich metadata queries for scientific applications. In this paper, we demonstrate that ROARS is capable of importing and exporting large quantities of data, migrating data to new storage nodes, providing robust fault tolerance, and generating materialized views based on metadata queries. Our experimental results demonstrate that ROARS' aggregate throughput scales with the number of concurrent clients while providing fault-tolerant data access. ROARS is currently being used to store 5.1 TB of data in our local biometrics repository.