Petal: distributed virtual disks
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Chord: A scalable peer-to-peer lookup service for internet applications
Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications
A scalable content-addressable network
Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications
Wide-area cooperative storage with CFS
SOSP '01 Proceedings of the eighteenth ACM symposium on Operating systems principles
Venti: A New Approach to Archival Storage
FAST '02 Proceedings of the Conference on File and Storage Technologies
A Survey of Distributed Garbage Collection Techniques
IWMM '95 Proceedings of the International Workshop on Memory Management
Kademlia: A Peer-to-Peer Information System Based on the XOR Metric
IPTPS '01 Revised Papers from the First International Workshop on Peer-to-Peer Systems
Erasure Coding Vs. Replication: A Quantitative Comparison
IPTPS '01 Revised Papers from the First International Workshop on Peer-to-Peer Systems
Pastry: Scalable, Decentralized Object Location, and Routing for Large-Scale Peer-to-Peer Systems
Middleware '01 Proceedings of the IFIP/ACM International Conference on Distributed Systems Platforms Heidelberg
PAST: A Large-Scale, Persistent Peer-to-Peer Storage Utility
HOTOS '01 Proceedings of the Eighth Workshop on Hot Topics in Operating Systems
OceanStore: An Extremely Wide-Area Storage System
OceanStore: An Extremely Wide-Area Storage System
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
FAB: building distributed enterprise disk arrays from commodity components
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
FPN: A Distributed Hash Table for Commercial Applications
HPDC '04 Proceedings of the 13th IEEE International Symposium on High Performance Distributed Computing
DISP: Practical, efficient, secure and fault-tolerant distributed data storage
ACM Transactions on Storage (TOS)
A Self-Organizing Storage Cluster for Parallel Data-Intensive Applications
Proceedings of the 2004 ACM/IEEE conference on Supercomputing
Deep Store: An Archival Storage System Architecture
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Deconstructing Commodity Storage Clusters
Proceedings of the 32nd annual international symposium on Computer Architecture
RepStore: A Self-Managing and Self-Tuning Storage Backend with Smart Bricks
ICAC '04 Proceedings of the First International Conference on Autonomic Computing
Stardust: tracking activity in a distributed storage system
SIGMETRICS '06/Performance '06 Proceedings of the joint international conference on Measurement and modeling of computer systems
Ursa minor: versatile cluster-based storage
FAST'05 Proceedings of the 4th conference on USENIX Conference on File and Storage Technologies - Volume 4
Glacier: highly durable, decentralized storage despite massive correlated failures
NSDI'05 Proceedings of the 2nd conference on Symposium on Networked Systems Design & Implementation - Volume 2
Ceph: a scalable, high-performance distributed file system
OSDI '06 Proceedings of the 7th symposium on Operating systems design and implementation
Pergamum: replacing tape with energy efficient, reliable, disk-based archival storage
FAST'08 Proceedings of the 6th USENIX Conference on File and Storage Technologies
Scalable performance of the Panasas parallel file system
FAST'08 Proceedings of the 6th USENIX Conference on File and Storage Technologies
Avoiding the disk bottleneck in the data domain deduplication file system
FAST'08 Proceedings of the 6th USENIX Conference on File and Storage Technologies
RADOS: a scalable, reliable storage service for petabyte-scale storage clusters
PDSW '07 Proceedings of the 2nd international workshop on Petascale data storage: held in conjunction with Supercomputing '07
Fast, inexpensive content-addressed storage in foundation
ATC'08 USENIX 2008 Annual Technical Conference on Annual Technical Conference
Modular software upgrades for distributed systems
ECOOP'06 Proceedings of the 20th European conference on Object-Oriented Programming
IEEE Communications Magazine
Tapestry: a resilient global-scale overlay for service deployment
IEEE Journal on Selected Areas in Communications
HydraFS: a high-throughput file system for the HYDRAstor content-addressable storage system
FAST'10 Proceedings of the 8th USENIX conference on File and storage technologies
Bimodal content defined chunking for backup streams
FAST'10 Proceedings of the 8th USENIX conference on File and storage technologies
Decentralized deduplication in SAN cluster file systems
USENIX'09 Proceedings of the 2009 conference on USENIX Annual technical conference
ChunkStash: speeding up inline storage deduplication using flash memory
USENIXATC'10 Proceedings of the 2010 USENIX conference on USENIX annual technical conference
KVZone and the search for a write-optimized key-value store
HotStorage'10 Proceedings of the 2nd USENIX conference on Hot topics in storage and file systems
Moving from logical sharing of guest OS to physical sharing of deduplication on virtual machine
HotSec'10 Proceedings of the 5th USENIX conference on Hot topics in security
Reliability analysis of deduplicated and erasure-coded storage
ACM SIGMETRICS Performance Evaluation Review
A study of practical deduplication
FAST'11 Proceedings of the 9th USENIX conference on File and stroage technologies
Tradeoffs in scalable data routing for deduplication clusters
FAST'11 Proceedings of the 9th USENIX conference on File and stroage technologies
Capo: recapitulating storage for virtual desktops
FAST'11 Proceedings of the 9th USENIX conference on File and stroage technologies
Improving throughput for small disk requests with proximal I/O
FAST'11 Proceedings of the 9th USENIX conference on File and stroage technologies
PRESIDIO: A Framework for Efficient Archival Data Storage
ACM Transactions on Storage (TOS)
Anchor-driven subchunk deduplication
Proceedings of the 4th Annual International Conference on Systems and Storage
Data deduplication system for supporting multi-mode
ACIIDS'11 Proceedings of the Third international conference on Intelligent information and database systems - Volume Part I
Building a high-performance deduplication system
USENIXATC'11 Proceedings of the 2011 USENIX conference on USENIX annual technical conference
Energy efficient file transfer mechanism using deduplication scheme
ICHIT'11 Proceedings of the 5th international conference on Convergence and hybrid information technology
A study of practical deduplication
ACM Transactions on Storage (TOS)
Live deduplication storage of virtual machine images in an open-source cloud
Middleware'11 Proceedings of the 12th ACM/IFIP/USENIX international conference on Middleware
Characteristics of backup workloads in production systems
FAST'12 Proceedings of the 10th USENIX conference on File and Storage Technologies
Shredder: GPU-accelerated incremental storage and computation
FAST'12 Proceedings of the 10th USENIX conference on File and Storage Technologies
iDedup: latency-aware, inline data deduplication for primary storage
FAST'12 Proceedings of the 10th USENIX conference on File and Storage Technologies
ISOBAR hybrid compression-I/O interleaving for large-scale parallel I/O optimization
Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing
Primary data deduplication-large scale study and system design
USENIX ATC'12 Proceedings of the 2012 USENIX conference on Annual Technical Conference
Reducing impact of data fragmentation caused by in-line deduplication
Proceedings of the 5th Annual International Systems and Storage Conference
A study on data deduplication in HPC storage systems
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Probabilistic deduplication for cluster-based storage systems
Proceedings of the Third ACM Symposium on Cloud Computing
Live deduplication storage of virtual machine images in an open-source cloud
Proceedings of the 12th International Middleware Conference
Space savings and design considerations in variable length deduplication
ACM SIGOPS Operating Systems Review
ACM Transactions on Storage (TOS)
A scalable inline cluster deduplication framework for big data protection
Proceedings of the 13th International Middleware Conference
COSBench: cloud object storage benchmark
Proceedings of the 4th ACM/SPEC International Conference on Performance Engineering
GPFS-SNC: an enterprise storage framework for virtual-machine clouds
IBM Journal of Research and Development
Fuzzy adaptive control for heterogeneous tasks in high-performance storage systems
Proceedings of the 6th International Systems and Storage Conference
A scalable deduplication and garbage collection engine for incremental backup
Proceedings of the 6th International Systems and Storage Conference
Proceedings of the 8th International Conference on Network and Service Management
SAFE: A Source Deduplication Framework for Efficient Cloud Backup Services
Journal of Signal Processing Systems
Content-based chunk placement scheme for decentralized deduplication on distributed file systems
ICCSA'13 Proceedings of the 13th international conference on Computational Science and Its Applications - Volume 1
Memory efficient sanitization of a deduplicated storage system
FAST'13 Proceedings of the 11th USENIX conference on File and Storage Technologies
Concurrent deletion in a distributed content-addressable storage system with global deduplication
FAST'13 Proceedings of the 11th USENIX conference on File and Storage Technologies
(Big)data in a virtualized world: volume, velocity, and variety in cloud datacenters
FAST'14 Proceedings of the 12th USENIX conference on File and Storage Technologies
A novel approach to data deduplication over the engineering-oriented cloud systems
Integrated Computer-Aided Engineering
Hi-index | 0.00 |
HYDRAstor is a scalable, secondary storage solution aimed at the enterprise market. The system consists of a back-end architectured as a grid of storage nodes built around a distributed hash table; and a front-end consisting of a layer of access nodes which implement a traditional file system interface and can be scaled in number for increased performance. This paper concentrates on the back-end which is, to our knowledge, the first commercial implementation of a scalable, high-performance content-addressable secondary storage delivering global duplicate elimination, per-block user-selectable failure resiliency, self-maintenance including automatic recovery from failures with data and network overlay rebuilding. The back-end programming model is based on an abstraction of a sea of variable-sized, content-addressed, immutable, highly-resilient data blocks organized in a DAG (directed acyclic graph). This model is exported with a low-level API allowing clients to implement new access protocols and to add them to the system on-line. The API has been validated with an implementation of the file system interface. The critical factor for meeting the design targets has been the selection of proper data organization based on redundant chains of data containers. We present this organization in detail and describe how it is used to deliver required data services. Surprisingly, the most complex to deliver turned out to be on-demand data deletion, followed (not surprisingly) by the management of data consistency and integrity.