StoreGPU: exploiting graphics processing units to accelerate distributed storage systems
HPDC '08 Proceedings of the 17th international symposium on High performance distributed computing
SCAN-Lite: enterprise-wide analysis on the cheap
Proceedings of the 4th ACM European conference on Computer systems
Sparse indexing: large scale, inline deduplication using sampling and locality
FAST '09 Proceedings of the 7th conference on File and storage technologies
On GPU's viability as a middleware accelerator
Cluster Computing
Managing security of virtual machine images in a cloud environment
Proceedings of the 2009 ACM workshop on Cloud computing security
Efficient locally trackable deduplication in replicated systems
Proceedings of the 10th ACM/IFIP/USENIX International Conference on Middleware
A GPU accelerated storage system
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
VMFlock: virtual machine co-migration for the cloud
Proceedings of the 20th international symposium on High performance distributed computing
Hash challenges: Stretching the limits of compare-by-hash in distributed data deduplication
Information Processing Letters
WAN optimized replication of backup datasets using stream-informed delta compression
FAST'12 Proceedings of the 10th USENIX conference on File and Storage Technologies
WAN-optimized replication of backup datasets using stream-informed delta compression
ACM Transactions on Storage (TOS)
A study on data deduplication in HPC storage systems
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Block locality caching for data deduplication
Proceedings of the 6th International Systems and Storage Conference
Concurrent deletion in a distributed content-addressable storage system with global deduplication
FAST'13 Proceedings of the 11th USENIX conference on File and Storage Technologies
File recipe compression in data deduplication systems
FAST'13 Proceedings of the 11th USENIX conference on File and Storage Technologies
We have developed a new storage system called the Jumbo Store (JS) based on encoding directory tree snapshots as graphs called HDAGs whose nodes are small variable-length chunks of data and whose edges are hash pointers. We store or transmit each node only once and encode using landmark-based chunking plus some new tricks. This leads to very efficient incremental upload and storage of successive snapshots: we report compression factors over 16x for real data; a comparison shows that our incremental upload sends only 1/5 as much data as Rsync. To demonstrate the utility of the Jumbo Store, we have integrated it into HP Labs' prototype Utility Rendering Service (URS), which accepts rendering data in the form of directory tree snapshots from small teams of animators, renders one or more requested frames using a processor farm, and then makes the rendered frames available for download. Efficient incremental upload is crucial to the URS's usability and responsiveness because of the teams' slow Internet connections. We report on the JS's performance during a major field test of the URS where the URS was offered to 11 groups of animators for 10 months during an animation showcase to create high-quality short animations.
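The storage idea in the abstract (variable-length chunks cut at content-defined landmarks, each stored or sent only once and addressed by its hash) can be illustrated with a minimal sketch. This is not the Jumbo Store implementation: the window size, landmark mask, chunk-size limits, the SHA-1 window fingerprint, and the names `split_chunks`, `ChunkStore`, `upload`, and `download` are all assumptions made for illustration, and a flat list of hashes stands in for the HDAG.

```python
import hashlib

# All constants below are illustrative assumptions, not values from the paper.
WINDOW = 16           # bytes fingerprinted at each candidate position
MASK = (1 << 6) - 1   # landmark when the low 6 fingerprint bits are zero
MIN_CHUNK = 16        # never cut a chunk shorter than this
MAX_CHUNK = 1024      # always cut once a chunk reaches this length


def split_chunks(data: bytes) -> list[bytes]:
    """Cut data into variable-length chunks at content-defined landmarks."""
    chunks = []
    start = 0
    for i in range(len(data)):
        length = i - start + 1
        if length < MIN_CHUNK:
            continue
        # Fingerprint the last WINDOW bytes; the window lies inside the
        # current chunk because length >= MIN_CHUNK >= WINDOW.
        window = data[i - WINDOW + 1 : i + 1]
        fingerprint = int.from_bytes(hashlib.sha1(window).digest()[:4], "big")
        if (fingerprint & MASK) == 0 or length >= MAX_CHUNK:
            chunks.append(data[start : i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing chunk without a landmark
    return chunks


class ChunkStore:
    """Content-addressed store: each distinct chunk is kept only once."""

    def __init__(self) -> None:
        self.chunks: dict[str, bytes] = {}
        self.bytes_received = 0  # bytes that actually had to be sent

    def put(self, chunk: bytes) -> str:
        key = hashlib.sha256(chunk).hexdigest()
        if key not in self.chunks:  # unseen chunk: store and count it
            self.chunks[key] = chunk
            self.bytes_received += len(chunk)
        return key


def upload(store: ChunkStore, data: bytes) -> list[str]:
    """Store a snapshot; return its recipe, a list of hash pointers."""
    return [store.put(chunk) for chunk in split_chunks(data)]


def download(store: ChunkStore, recipe: list[str]) -> bytes:
    """Rebuild a snapshot by following its hash pointers."""
    return b"".join(store.chunks[key] for key in recipe)


if __name__ == "__main__":
    # Deterministic 4096-byte test payload, then a small edit in the middle.
    v1 = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(128))
    v2 = v1[:2000] + b"EDIT" + v1[2000:]

    store = ChunkStore()
    upload(store, v1)
    sent_before = store.bytes_received
    recipe_v2 = upload(store, v2)
    print("bytes sent for second snapshot:", store.bytes_received - sent_before)
    print("round trip ok:", download(store, recipe_v2) == v2)
```

Because chunk boundaries depend on content rather than on byte offsets, an insertion only changes the chunks around the edit; chunks that end before the edit are byte-identical and are never sent again. In this sketch the fingerprint is recomputed from scratch at every position for clarity, where a practical system would likely use a rolling hash.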