Supporting practical content-addressable caching with CZIP compression

  • Authors:
  • KyoungSoo Park;Sunghwan Ihm;Mic Bowman;Vivek S. Pai

  • Affiliations:
  • Princeton University;Princeton University;Intel Research;Princeton University

  • Venue:
  • ATC'07: Proceedings of the 2007 USENIX Annual Technical Conference
  • Year:
  • 2007

Abstract

Content-based naming (CBN) enables content sharing across similar files by breaking files into position-independent chunks and naming these chunks using hashes of their contents. While a number of research systems have recently used custom CBN approaches internally to good effect, there has not yet been any mechanism to use CBN in a general-purpose way. In this paper, we demonstrate a practical approach to applying CBN without requiring disruptive changes to end systems. We develop CZIP, a CBN compression scheme which reduces data sizes by eliminating redundant chunks, compresses chunks using existing schemes, and facilitates sharing within files, across files, and across machines by explicitly exposing CBN chunk hashes. CZIP-aware caching systems can exploit the CBN information to reduce storage space, reduce bandwidth consumption, and increase performance, while content providers and middleboxes can selectively encode their most suitable content. We show that CZIP compares well to stand-alone compression schemes, that a CBN cache for CZIP is easily implemented, and that a CZIP-aware CDN produces significant benefits.
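
The general idea in the abstract (position-independent chunks, named by content hash, with redundant chunks stored once and each chunk compressed by an existing scheme) can be illustrated with a minimal sketch. This is not the CZIP format or the authors' implementation. The boundary rule (a CRC32 over a sliding window), the constants `WINDOW` and `MASK`, the use of SHA-1 and zlib, and the function names `split_chunks`, `encode`, and `decode` are all assumptions chosen for illustration.

```python
import hashlib
import zlib

WINDOW = 16   # assumed: bytes of context that decide a chunk boundary
MASK = 0x3F   # assumed: a boundary roughly every 64 bytes on random data


def split_chunks(data: bytes) -> list[bytes]:
    """Split data at boundaries chosen only from the local content.

    A boundary is placed after a byte when the CRC32 of the preceding
    WINDOW bytes has its low bits equal to zero. Because the decision
    depends only on nearby bytes, inserting data early in a file does
    not shift the boundaries found later in it.
    """
    chunks = []
    start = 0
    for end in range(WINDOW, len(data) + 1):
        window = data[end - WINDOW:end]
        if (zlib.crc32(window) & MASK) == 0:
            chunks.append(data[start:end])
            start = end
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes after the last boundary
    return chunks


def encode(data: bytes, store: dict[str, bytes]) -> list[str]:
    """Return the list of chunk names for data.

    Each chunk is named by the SHA-1 of its contents. A chunk is
    compressed and added to the shared store only if its name is new.
    """
    names = []
    for chunk in split_chunks(data):
        name = hashlib.sha1(chunk).hexdigest()
        if name not in store:
            store[name] = zlib.compress(chunk)
        names.append(name)
    return names


def decode(names: list[str], store: dict[str, bytes]) -> bytes:
    """Rebuild the original bytes from chunk names and the shared store."""
    return b"".join(zlib.decompress(store[name]) for name in names)
```

In this sketch, encoding two similar files against the same `store` yields two name lists that overlap wherever the files share content, so a cache keyed by chunk name holds each shared chunk once. A practical chunker would also bound the minimum and maximum chunk size and use a true rolling hash rather than recomputing the window hash at every byte; both are omitted here for brevity.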