Data deduplication system for supporting multi-mode
ACIIDS'11 Proceedings of the Third international conference on Intelligent information and database systems - Volume Part I
USENIXATC'11 Proceedings of the 2011 USENIX conference on USENIX annual technical conference
Proof of retrieval and ownership protocols for enterprise-level data deduplication
CASCON '13 Proceedings of the 2013 Conference of the Center for Advanced Studies on Collaborative Research
Hi-index | 0.00 |
Existing de-duplication solutions in cloud backup environment either obtain high compression ratios at the cost of heavy de-duplication overheads in terms of increased latency and reduced throughput, or maintain small de-duplication overheads at the cost of low compression ratios causing high data transmission costs, which results in a large backup window. In this paper, we present SAM, a Semantic-Aware Multitiered source de-duplication framework that first combines the global file-level de-duplication and local chunk-level deduplication, and further exploits file semantics in each stage in the framework, to obtain an optimal tradeoff between the deduplication efficiency and de-duplication overhead and finally achieve a shorter backup window than existing approaches. Our experimental results with real world datasets show that SAM not only has a higher de-duplication efficiency/overhead ratio than existing solutions, but also shortens the backup window by an average of 38.7%.