Improving restore speed for backup systems that use inline chunk-based deduplication

  • Authors:
  • Mark Lillibridge;Kave Eshghi;Deepavali Bhagwat

  • Affiliations:
  • HP Labs;HP Labs;HP Storage

  • Venue:
  • FAST'13 Proceedings of the 11th USENIX conference on File and Storage Technologies
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

Slow restoration due to chunk fragmentation is a serious problem facing inline chunk-based data deduplication systems: restore speeds for the most recent backup can drop orders of magnitude over the lifetime of a system. We study three techniques--increasing cache size, container capping, and using a forward assembly area-- for alleviating this problem. Container capping is an ingest-time operation that reduces chunk fragmentation at the cost of forfeiting some deduplication, while using a forward assembly area is a new restore-time caching and prefetching technique that exploits the perfect knowledge of future chunk accesses available when restoring a backup to reduce the amount of RAM required for a given level of caching at restore time. We show that using a larger cache per stream--we see continuing benefits even up to 8 GB--can produce up to a 5-16X improvement, that giving up as little as 8% deduplication with capping can yield a 2-6X improvement, and that using a forward assembly area is strictly superior to LRU, able to yield a 2-4X improvement while holding the RAM budget constant.