Improved Behaviour of Tries by the "Symmetrization" of the Source

  • Authors: Yuriy A. Reznik; Wojciech Szpankowski
  • Venue: DCC '02: Proceedings of the Data Compression Conference
  • Year: 2002

Abstract

We propose a preprocessing technique for improving the performance of digital tree (trie)-based search algorithms under asymmetric memoryless sources. This technique, which we call a symmetrization of the source, bijectively maps sequences of symbols from the original (asymmetric) source into symbols of an output alphabet with a more uniform distribution. We introduce a criterion of efficiency for such a mapping and demonstrate that the problem of finding a symmetrization transform that is optimal for a given source (or universal for a class of sources) is equivalent to the problem of constructing a minimum-redundancy variable-length-to-block code for that source (or class of sources). Based on this result, we propose search algorithms that incorporate known variable-length-to-block codes, both optimal for a given source and universal, and study their asymptotic behaviour. We complement our analysis with a description of an efficient algorithm for universal symmetrization of binary memoryless sources, and compare the performance of the resulting search structure with that of standard tries.
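
To make the idea of symmetrization concrete, the following minimal Python sketch maps a biased binary memoryless source onto block symbols using a Tunstall code, a well-known variable-length-to-block code for a source with known statistics. The abstract does not name the specific codes used, so the choice of Tunstall coding here is an assumption, and the names `tunstall_dictionary`, `symmetrize`, and `normalized_entropy` are hypothetical helpers introduced only for illustration.

```python
import heapq
import math


def tunstall_dictionary(p_one, num_words):
    """Build a Tunstall parse dictionary for a binary memoryless source.

    p_one is the probability of symbol "1"; num_words (>= 2) is the size of
    the output alphabet. Returns a dict mapping each phrase to its probability.
    """
    symbol_probs = {"0": 1.0 - p_one, "1": p_one}
    # heapq is a min-heap, so probabilities are stored negated.
    heap = [(-p, s) for s, p in symbol_probs.items()]
    heapq.heapify(heap)
    # Repeatedly split the most probable phrase; each split adds one phrase.
    while len(heap) < num_words:
        neg_p, phrase = heapq.heappop(heap)
        for s, p in symbol_probs.items():
            heapq.heappush(heap, (neg_p * p, phrase + s))
    return {phrase: -neg_p for neg_p, phrase in heap}


def symmetrize(bits, phrase_index):
    """Parse a bit string into phrases and emit their block-symbol indices.

    The phrases form a complete prefix-free set, so the parse is unique.
    Returns the output symbols and any unparsed tail.
    """
    out, current = [], ""
    for b in bits:
        current += b
        if current in phrase_index:
            out.append(phrase_index[current])
            current = ""
    return out, current


def normalized_entropy(probs):
    """Entropy of a distribution divided by its maximum, log2(len(probs))."""
    h = -sum(p * math.log2(p) for p in probs if p > 0)
    return h / math.log2(len(probs))


# Assumed example parameters: P("1") = 0.9 and an 8-symbol output alphabet.
phrases = tunstall_dictionary(0.9, 8)
words = sorted(phrases)
index = {w: i for i, w in enumerate(words)}
symbols, tail = symmetrize("1111111010", index)
```

In this sketch the output symbols are noticeably closer to uniform than the input bits, which is the property that would keep a trie built over them better balanced. A universal variant, as mentioned in the abstract, would not assume that `p_one` is known in advance.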