Syntactic clustering of the Web
Selected papers from the sixth international conference on World Wide Web
XMill: an efficient compressor for XML data
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Min-wise independent permutations
Journal of Computer and System Sciences - 30th annual ACM symposium on theory of computing
A low-bandwidth network file system
SOSP '01 Proceedings of the eighteenth ACM symposium on Operating systems principles
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Compactly encoding unstructured inputs with differential compression
Journal of the ACM (JACM)
GPFS: A Shared-Disk File System for Large Computing Clusters
FAST '02 Proceedings of the Conference on File and Storage Technologies
Venti: A New Approach to Archival Storage
FAST '02 Proceedings of the Conference on File and Storage Technologies
Rules of Thumb in Data Engineering
ICDE '00 Proceedings of the 16th International Conference on Data Engineering
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Redundancy elimination within large collections of files
ATEC '04 Proceedings of the annual conference on USENIX Annual Technical Conference
Alternatives for detecting redundancy in storage systems data
ATEC '04 Proceedings of the annual conference on USENIX Annual Technical Conference
Challenges in managing dependable data systems
ACM SIGMETRICS Performance Evaluation Review - Design, implementation, and performance of storage systems
A fresh look at the reliability of long-term digital storage
Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006
TAPER: tiered approach for eliminating redundancy in replica synchronization
FAST'05 Proceedings of the 4th conference on USENIX Conference on File and Storage Technologies - Volume 4
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
POTSHARDS: secure long-term storage without encryption
ATC'07 2007 USENIX Annual Technical Conference on Proceedings of the USENIX Annual Technical Conference
Pergamum: replacing tape with energy efficient, reliable, disk-based archival storage
FAST'08 Proceedings of the 6th USENIX Conference on File and Storage Technologies
Avoiding the disk bottleneck in the data domain deduplication file system
FAST'08 Proceedings of the 6th USENIX Conference on File and Storage Technologies
HPDC '08 Proceedings of the 17th international symposium on High performance distributed computing
Proceedings of the 4th ACM international workshop on Storage security and survivability
Demystifying data deduplication
Proceedings of the ACM/IFIP/USENIX Middleware '08 Conference Companion
Preservation DataStores: new storage paradigm for preservation environments
IBM Journal of Research and Development
IZO: applications of large-window compression to virtual machine management
LISA'08 Proceedings of the 22nd conference on Large installation system administration conference
Sparse indexing: large scale, inline deduplication using sampling and locality
FAST '09 Proccedings of the 7th conference on File and storage technologies
HYDRAstor: a Scalable Secondary Storage
FAST '09 Proccedings of the 7th conference on File and storage technologies
Smoke and mirrors: reflecting files at a geographically remote location without loss of performance
FAST '09 Proccedings of the 7th conference on File and storage technologies
The effectiveness of deduplication on virtual machine disk images
SYSTOR '09 Proceedings of SYSTOR 2009: The Israeli Experimental Systems Conference
Multi-level comparison of data deduplication in a backup scenario
SYSTOR '09 Proceedings of SYSTOR 2009: The Israeli Experimental Systems Conference
POTSHARDS—a secure, recoverable, long-term archival storage system
ACM Transactions on Storage (TOS)
Multi-agent Based Personal File Management Using Case Based Reasoning
HAIS '09 Proceedings of the 4th International Conference on Hybrid Artificial Intelligence Systems
Euro-Par '09 Proceedings of the 15th International Euro-Par Conference on Parallel Processing
Managing computer files via artificial intelligence approaches
Artificial Intelligence Review
BlobSeer: how to enable efficient versioning for large object storage under heavy access concurrency
Proceedings of the 2009 EDBT/ICDT Workshops
Distributed archiving storage system based on CAS
VECIMS'09 Proceedings of the 2009 IEEE international conference on Virtual Environments, Human-Computer Interfaces and Measurement Systems
HydraFS: a high-throughput file system for the HYDRAstor content-addressable storage system
FAST'10 Proceedings of the 8th USENIX conference on File and storage technologies
Bimodal content defined chunking for backup streams
FAST'10 Proceedings of the 8th USENIX conference on File and storage technologies
Moving from logical sharing of guest OS to physical sharing of deduplication on virtual machine
HotSec'10 Proceedings of the 5th USENIX conference on Hot topics in security
PRESIDIO: A Framework for Efficient Archival Data Storage
ACM Transactions on Storage (TOS)
Anchor-driven subchunk deduplication
Proceedings of the 4th Annual International Conference on Systems and Storage
Analysis of Workload Behavior in Scientific and Historical Long-Term Data Repositories
ACM Transactions on Storage (TOS)
iDedup: latency-aware, inline data deduplication for primary storage
FAST'12 Proceedings of the 10th USENIX conference on File and Storage Technologies
Hash based disk imaging using AFF4
Digital Investigation: The International Journal of Digital Forensics & Incident Response
Multimedia Tools and Applications
SAFE: A Source Deduplication Framework for Efficient Cloud Backup Services
Journal of Signal Processing Systems
SBBS: A sliding blocking algorithm with backtracking sub-blocks for duplicate data detection
Expert Systems with Applications: An International Journal
Concurrent deletion in a distributed content-addressable storage system with global deduplication
FAST'13 Proceedings of the 11th USENIX conference on File and Storage Technologies
Hi-index | 0.00 |
We present the Deep Store archival storage architecture, a large-scale storage system that stores immutable dataefficiently and reliably for long periods of time. Archived data is stored across a cluster of nodes and recorded to hard disk. The design differentiates itself from traditional file systems by eliminating redundancy within and across files, distributing content for scalability, associating rich metadata with content, and using variable levels of replication based on the importance or degree of dependency of each piece of stored data. We evaluate the foundations of our design, including PRESIDIO, a virtual content-addressable storage framework with multiple methods for inter-file and intra-file compression that effectively addresses the data-dependent variability of data compression. We measure content and metadata storage efficiency, demonstrate the need for a variable-degree replication model, and provide preliminary results for storage performance.