Towards a complete characterization of tries

  • Authors: Gahyun Park; Wojciech Szpankowski
  • Affiliations: Purdue University, West Lafayette, IN (both authors)
  • Venue: SODA '05 Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms
  • Year: 2005

Abstract

Tries and suffix trees are the most popular data structures on words. Tries were introduced in 1960 by Fredkin as an efficient method for searching and sorting digital data. Since then, a myriad of novel trie applications have been found, such as dynamic hashing, conflict resolution algorithms, leader election algorithms, IP address lookup, coding, polynomial factorization, Lempel-Ziv compression schemes, and so on. Furthermore, various analyses of tries reveal new fundamental properties of strings appearing in those applications. Parameters of interest are the (partial) fillup level (the largest full level of the trie), shortest path, height (longest path), typical depth, and path length (sum of depths). All of these parameters are analyzed here in a unifying manner by studying the external and internal profiles. The profile of a trie at level k is the number of nodes (internal or external) at level k. We derive recurrences for both profiles and solve them asymptotically for various ranges of k when the strings stored in the trie are generated by a memoryless source (extension to a Markov source is possible). In particular, we present asymptotic results for the average profile, the variance, and the limiting distribution. As consequences, we find the height, shortest path, fillup level, and the depth. These results are derived here by methods of analytic algorithmics such as generating functions, the Mellin transform, poissonization and depoissonization, and the saddle point method.
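
To make the definitions concrete, the following is a minimal Python sketch (not the authors' method, which is analytic) that builds a binary trie on a set of distinct bit strings, tabulates its external and internal profiles, and reads the height, shortest path, path length, and fillup level off those profiles. The function names `trie_profiles`, `summarize`, and `memoryless_strings` are hypothetical, and two assumptions go beyond the abstract: the alphabet is binary, and a level k counts as "full" when it holds all 2^k possible internal nodes.

```python
import random
from collections import Counter


def trie_profiles(strings):
    """Return (external, internal) profiles of the binary trie on `strings`.

    external[k] / internal[k] = number of external / internal nodes at level k.
    Assumes distinct, equal-length strings over the alphabet {'0', '1'}.
    """
    external, internal = Counter(), Counter()
    group = list(set(strings))
    # Each stack entry: (strings sharing a common prefix, length of that prefix).
    stack = [(group, 0)] if group else []
    while stack:
        group, level = stack.pop()
        if len(group) == 1:              # a single string: external node (leaf)
            external[level] += 1
            continue
        internal[level] += 1             # two or more strings: internal node
        for bit in "01":
            child = [s for s in group if s[level] == bit]
            if child:
                stack.append((child, level + 1))
    return external, internal


def summarize(external, internal):
    """Derive the trie parameters named in the abstract from the two profiles."""
    return {
        "height": max(external),                      # longest path
        "shortest_path": min(external),
        "path_length": sum(k * c for k, c in external.items()),
        # Assumption: level k is "full" if it has all 2**k internal nodes.
        "fillup_level": max(
            (k for k, c in internal.items() if c == 2 ** k), default=-1
        ),
    }


def memoryless_strings(n, p, length, rng):
    """Draw n distinct bit strings; each bit is '1' independently with prob. p."""
    out = set()
    while len(out) < n:
        out.add("".join("1" if rng.random() < p else "0" for _ in range(length)))
    return sorted(out)


# Small hand-checkable example.
ext, inte = trie_profiles(["000", "001", "011", "100"])
print(dict(ext), dict(inte), summarize(ext, inte))

# Random trie from a symmetric memoryless source (p = 0.5).
sample = memoryless_strings(200, 0.5, 64, random.Random(0))
print(summarize(*trie_profiles(sample)))
```

For the four-string example, the external profile has one node at level 1, one at level 2, and two at level 3, so the height is 3, the shortest path is 1, and the path length is 9. Only the root level is full, which gives a fillup level of 0 under the assumption stated above.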