Efficient locally trackable deduplication in replicated systems

  • Authors: João Barreto; Paulo Ferreira
  • Affiliations: Distributed Systems Group, INESC-ID, Technical University of Lisbon (both authors)
  • Venue: Middleware '09: Proceedings of the ACM/IFIP/USENIX 10th International Conference on Middleware
  • Year: 2009

Abstract

We propose a novel technique for distributed data deduplication in distributed storage systems. We combine version tracking with high-precision, local similarity detection techniques. Compared with the prominent techniques of delta encoding and compare-by-hash, our solution retains most of the advantages that distinguish each of these alternatives. A thorough experimental evaluation, comparing a full-fledged implementation of our technique against popular systems based on delta encoding and compare-by-hash, confirms gains in performance and transferred volumes for a wide range of real workloads and scenarios.