A running time improvement for the two thresholds two divisors algorithm

  • Authors:
  • Teng-Sheng Moh; BingChun Chang

  • Affiliations:
  • San Jose State University, San Jose, CA; San Jose State University, San Jose, CA

  • Venue:
  • Proceedings of the 48th Annual Southeast Regional Conference
  • Year:
  • 2010

Abstract

Chunking algorithms play an important role in hash-based data de-duplication systems. The Basic Sliding Window (BSW) algorithm is the first prototype of a content-based chunking algorithm that can handle most types of data. The Two Thresholds Two Divisors (TTTD) algorithm was proposed to improve on BSW by controlling the variation in chunk size. We conducted a series of systematic experiments to evaluate the performance of these two algorithms, and we propose a new improvement to the TTTD algorithm. Our new approach reduced the running time by about 6% and the number of large-sized chunks by 50%, and also brought other significant benefits.
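
For readers unfamiliar with TTTD, the following is a minimal sketch of the baseline idea as it is commonly described: a minimum and a maximum chunk-size threshold, a main divisor that decides regular breakpoints, and a backup divisor that supplies a fallback breakpoint when the maximum threshold is reached. It is not the improved variant proposed in this paper. The function name `tttd_chunks`, the polynomial rolling hash, the window size, and all default parameter values are illustrative assumptions, not taken from the abstract.

```python
def tttd_chunks(data: bytes, t_min=460, t_max=2800,
                d_main=540, d_backup=270, window=48):
    """Split data into content-defined chunks with a TTTD-style rule.

    All defaults are illustrative assumptions, not values from the paper.
    """
    BASE, MOD = 257, (1 << 31) - 1
    top = pow(BASE, window - 1, MOD)  # weight of the byte leaving the window

    chunks = []
    start = 0      # index where the current chunk begins
    backup = -1    # most recent backup breakpoint, or -1 if none yet
    h = 0          # rolling hash of the last `window` bytes
    win_len = 0    # number of bytes currently inside the window
    i, n = 0, len(data)

    while i < n:
        # Slide the window forward by one byte.
        if win_len == window:
            h = (h - data[i - window] * top) % MOD
        else:
            win_len += 1
        h = (h * BASE + data[i]) % MOD
        i += 1

        size = i - start
        if size < t_min:
            continue  # first threshold: never cut below the minimum size

        if h % d_backup == d_backup - 1:
            backup = i  # remember a fallback breakpoint

        cut = -1
        if h % d_main == d_main - 1:
            cut = i  # regular breakpoint found by the main divisor
        elif size >= t_max:
            # Second threshold: prefer the backup breakpoint, else cut here.
            cut = backup if backup != -1 else i

        if cut != -1:
            chunks.append(data[start:cut])
            start = cut
            backup = -1
            # Restart scanning at the cut (it may lie before i).
            i, h, win_len = cut, 0, 0

    if start < n:
        chunks.append(data[start:])  # trailing chunk may be shorter than t_min
    return chunks
```

In this sketch every chunk except possibly the last has a size between `t_min` and `t_max`, which is the chunk-size control that distinguishes TTTD from a plain sliding-window scheme. Setting `t_min=0`, a very large `t_max`, and ignoring the backup divisor would reduce it to a BSW-like chunker.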