The expected profile of digital search trees

  • Authors: Michael Drmota; Wojciech Szpankowski
  • Affiliations: Institut für Diskrete Mathematik und Geometrie, TU Wien, A-1040 Wien, Austria; Department of Computer Science, Purdue University, W. Lafayette, IN 47907, USA
  • Venue: Journal of Combinatorial Theory, Series A
  • Year: 2011

Abstract

A digital search tree (DST) is a fundamental data structure on words that finds various applications from the popular Lempel-Ziv'78 data compression scheme to distributed hash tables. The profile of a DST measures the number of nodes at the same distance from the root; it depends on the number of stored strings and the distance from the root. Most parameters of a DST (e.g., depth, height, fillup) can be expressed in terms of the profile. We study here asymptotics of the average profile in a DST built from sequences generated independently by a memoryless source. After representing the average profile by a recurrence, we solve it using a wide range of analytic tools. This analysis is surprisingly demanding, but once it is carried out it reveals an unusually intriguing and interesting behavior. The average profile undergoes phase transitions when moving from the root to the longest path: at first it resembles a full tree until it abruptly starts growing polynomially and oscillating in this range. These results are derived by methods of analytic combinatorics such as generating functions, Mellin transform, poissonization and depoissonization, the saddle point method, singularity analysis and uniform asymptotic analysis.
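
To make the notion of a profile concrete, the following is a minimal simulation sketch and not the analytic method of the paper. It assumes a binary alphabet, a memoryless source that emits 0 with probability p, and the common DST variant in which the root stores the first string. The function name dst_profile and its parameters are hypothetical, chosen only for this illustration.

```python
import random


def dst_profile(n, p=0.5, seed=0):
    """Simulate a binary DST on n random strings and return its profile.

    profile[k] is the number of nodes at distance k from the root.
    Assumptions (not taken from the paper): the root holds the first
    string, and each bit is 0 with probability p, independently of
    all other bits (memoryless source).
    """
    if n <= 0:
        return []
    rng = random.Random(seed)
    root = [None, None]          # a node is just its pair of child slots
    profile = [1]                # the root is the only node at level 0
    for _ in range(n - 1):
        node, depth = root, 0
        while True:
            # Draw the next bit of the current string lazily.
            bit = 0 if rng.random() < p else 1
            depth += 1
            child = node[bit]
            if child is None:
                # First empty slot on the path: the string is stored here.
                node[bit] = [None, None]
                if depth == len(profile):
                    profile.append(0)
                profile[depth] += 1
                break
            node = child
    return profile
```

For example, dst_profile(1000) returns a list whose entries sum to 1000. With p = 0.5 the first few levels are typically full (1, 2, 4, ...), in line with the "full tree" regime described above. Averaging the result over many seeds gives an empirical estimate of the average profile.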