File systems hosting virtual machines typically contain many duplicated blocks of data, resulting in wasted storage space and a larger storage array cache footprint. Deduplication addresses these problems by storing a single instance of each unique data block and sharing it among all original sources of that data. While deduplication is well understood for file systems with a centralized component, we investigate it in a decentralized cluster file system, specifically in the context of VM storage. We propose DEDE, a block-level deduplication system for live cluster file systems that does not require any central coordination, tolerates host failures, and takes advantage of the block layout policies of an existing cluster file system. In DEDE, hosts keep summaries of their own writes to the cluster file system in shared on-disk logs. Each host periodically and independently processes the summaries of its locked files, merges them with a shared index of blocks, and reclaims any duplicate blocks. DEDE manipulates metadata using general file system interfaces without knowledge of the file system implementation. We present the design, implementation, and evaluation of our techniques in the context of VMware ESX Server. Our results show an 80% reduction in space with minor performance overhead for realistic workloads.
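The log-then-merge flow described above can be illustrated with a toy, in-memory sketch. It is not DEDE's implementation: the `Store` class, the `merge_log` function, the use of SHA-1, and the reference-counting scheme are all assumptions made for illustration, and file locking, on-disk log formats, and failure handling are omitted. Each write appends a (file, block index, hash) summary to a per-host log. A later merge pass compares those summaries against a shared hash index and repoints duplicate blocks to a single shared copy.

```python
import hashlib
from dataclasses import dataclass, field


def digest(data: bytes) -> str:
    """Content hash used as the index key (SHA-1 is an assumption)."""
    return hashlib.sha1(data).hexdigest()


@dataclass
class Store:
    """Hypothetical stand-in for shared storage, not a real file system API."""
    blocks: dict = field(default_factory=dict)  # block id -> bytes
    refs: dict = field(default_factory=dict)    # block id -> reference count
    files: dict = field(default_factory=dict)   # file name -> {block index: block id}
    next_id: int = 0

    def release(self, block_id: int) -> None:
        """Drop one reference and free the block when none remain."""
        self.refs[block_id] -= 1
        if self.refs[block_id] == 0:
            del self.refs[block_id], self.blocks[block_id]

    def write(self, name: str, idx: int, data: bytes, log: list) -> None:
        """Write to a fresh block (copy-on-write) and log a summary of the write."""
        block_map = self.files.setdefault(name, {})
        if idx in block_map:
            self.release(block_map[idx])
        block_id, self.next_id = self.next_id, self.next_id + 1
        self.blocks[block_id] = data
        self.refs[block_id] = 1
        block_map[idx] = block_id
        log.append((name, idx, digest(data)))

    def read(self, name: str, idx: int) -> bytes:
        return self.blocks[self.files[name][idx]]


def merge_log(store: Store, index: dict, log: list) -> int:
    """Merge one host's write summaries into the shared index.

    Returns the number of duplicate blocks reclaimed.
    """
    reclaimed = 0
    for name, idx, h in log:
        block_id = store.files.get(name, {}).get(idx)
        # Skip stale summaries: the block was overwritten after it was logged.
        if block_id is None or digest(store.blocks[block_id]) != h:
            continue
        known = index.get(h)
        if known is None or known not in store.blocks or digest(store.blocks[known]) != h:
            index[h] = block_id  # first live block seen with this content
        elif known != block_id and store.blocks[known] == store.blocks[block_id]:
            # Byte-for-byte match: share the indexed block, free the duplicate.
            store.files[name][idx] = known
            store.refs[known] += 1
            store.release(block_id)
            reclaimed += 1
    log.clear()
    return reclaimed
```

In this sketch, two hosts that each write an identical block to different files end up sharing one physical block after both logs are merged, and a later write to either file allocates a fresh block, so the other file's data is unaffected. Deduplication happens out of band, so the write path only pays for hashing and a log append.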