Leveraging predefined Huffman dictionaries for high compression rate and ratio

  • Authors:
  • Amit Golander;Shai Taharlev;Lior Glass;Giora Biran;Sagi Manole

  • Affiliations:
  • Tonian;IBM;University of Michigan;IBM;Tonian

  • Venue:
  • Proceedings of the 6th International Systems and Storage Conference
  • Year:
  • 2013

Abstract

The explosion of data, both in motion and at rest, along with the popularity of cloud computing, has created a need for better in-line compression solutions. Current Huffman coding modes each optimize a single metric: compression ratio (quality) or compression rate (performance). The ratio-focused mode, for example, compresses the data 15% further than the rate-focused mode, but takes 20% longer to execute. In this paper, we show how to balance the tradeoff between compression ratio and rate without modifying existing standards or legacy decompression implementations. We present two Huffman encoding heuristics that achieve close-to-optimal compression rate and ratio (within 2%). Both heuristics are practical: the first is better suited to hardware implementations and the second to software implementations. We believe that such in-line compression heuristics can help enable the continued advancement of cloud computing and storage.
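
The baseline tradeoff described in the abstract can be illustrated with a minimal sketch. It assumes that the two modes correspond to DEFLATE's fixed (predefined) and dynamic (per-block) Huffman tables, which the abstract itself does not state, and it uses Python's standard zlib bindings. The helper name `deflate` and the sample data are hypothetical, and the paper's two heuristics are not reproduced here.

```python
import zlib


def deflate(data: bytes, strategy: int) -> bytes:
    """Compress data with zlib using the given Huffman strategy (hypothetical helper)."""
    comp = zlib.compressobj(
        level=6,
        method=zlib.DEFLATED,
        wbits=15,
        memLevel=8,
        strategy=strategy,
    )
    return comp.compress(data) + comp.flush()


# Illustrative input only: repetitive, log-like text with a skewed symbol distribution.
sample = b"".join(b"record %d: status=ok\n" % (i % 97) for i in range(5000))

# Z_FIXED forces the predefined Huffman tables, so no tables are built per block.
fixed_out = deflate(sample, zlib.Z_FIXED)

# The default strategy lets zlib build per-block (dynamic) Huffman tables.
dynamic_out = deflate(sample, zlib.Z_DEFAULT_STRATEGY)

# Both streams are decoded by the same, unmodified decompressor.
assert zlib.decompress(fixed_out) == sample
assert zlib.decompress(dynamic_out) == sample

print("fixed tables:  ", len(fixed_out), "bytes")
print("dynamic tables:", len(dynamic_out), "bytes")
```

The size gap and timing difference seen with such a sketch depend on the input and the zlib build, so they should not be expected to match the 15% and 20% figures reported in the abstract.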