A digital search tree (DST) -- one of the most fundamental data structures on words -- is a digital tree in which keys (strings, words) are stored directly in (internal) nodes. Such trees find myriad applications, from the popular Lempel-Ziv'78 data compression scheme to distributed hash tables. The profile of a DST counts the nodes at a given distance from the root; it is a function of the number of stored strings and the distance from the root. Most parameters of a DST (e.g., height, fill-up level) can be expressed in terms of the profile. However, since the inception of DSTs, the analysis of the profile has remained elusive, and it has become a prominent open problem in the analysis of algorithms. We make here the first, but decisive, step towards solving this problem. We present a precise analysis of the average profile when the stored strings are generated by a biased memoryless source. The main technical difficulty in analyzing the profile lies in solving a sophisticated recurrence equation. We present such a solution for the Poissonized version of the problem (i.e., when the number of stored strings follows a Poisson distribution) in the Mellin transform domain. To accomplish this, we introduce a novel functional operator that allows us to express the solution in explicit form, and then use tools of analytic algorithmics to extract the asymptotic behavior of the profile. This analysis is surprisingly demanding, but once carried out it reveals unusually intriguing behavior. The average profile undergoes several phase transitions when moving from the root to the longest path. At first it resembles a full tree, until it abruptly starts growing polynomially and oscillating in this range. Our results are derived by methods of analytic algorithmics such as generating functions, the Mellin transform, Poissonization and de-Poissonization, the saddle-point method, singularity analysis, and uniform asymptotic analysis.
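To make the objects in the abstract concrete, here is a minimal Python sketch of a binary DST and its profile. It is an illustration under stated assumptions, not the paper's method. All names (`Node`, `insert`, `profile`, `random_key`, `expected_profile`) are hypothetical, and the convention that a key bit equals 1 with probability `p` is an assumption. The function `expected_profile` evaluates the standard recurrence for the average profile of a DST under a memoryless source, B(n+1, k) = sum over j of C(n, j) p^j q^(n-j) [B(j, k-1) + B(n-j, k-1)], with B(n, 0) = 1 for n >= 1 and B(0, k) = 0. The abstract does not state this recurrence, so treat its form here as an assumption.

```python
import random
from math import comb


class Node:
    """One DST node: it stores a whole key and has up to two children."""
    __slots__ = ("key", "children")

    def __init__(self, key):
        self.key = key
        self.children = [None, None]


def insert(root, key):
    """Insert a key (a sequence of 0/1 bits) and return the root.

    The bit at position `depth` selects the branch; the key is stored
    in the first empty slot found along that path.
    """
    if root is None:
        return Node(key)
    node, depth = root, 0
    while True:
        bit = key[depth]
        child = node.children[bit]
        if child is None:
            node.children[bit] = Node(key)
            return root
        node, depth = child, depth + 1


def profile(root):
    """Return a list whose k-th entry is the number of nodes at depth k."""
    counts = []
    level = [root] if root is not None else []
    while level:
        counts.append(len(level))
        level = [c for n in level for c in n.children if c is not None]
    return counts


def random_key(rng, p, length):
    """Biased memoryless source (assumed convention): each bit is 1 w.p. p."""
    return tuple(1 if rng.random() < p else 0 for _ in range(length))


def expected_profile(n, p, k_max):
    """Average profile B(n, 0..k_max) from the recurrence in the lead-in."""
    q = 1.0 - p
    B = [[0.0] * (k_max + 1) for _ in range(n + 1)]
    for m in range(1, n + 1):
        B[m][0] = 1.0  # a non-empty tree always has exactly one root
    for k in range(1, k_max + 1):
        for m in range(n):
            # The root takes one key; the other m keys split binomially.
            B[m + 1][k] = sum(
                comb(m, j) * p**j * q**(m - j) * (B[j][k - 1] + B[m - j][k - 1])
                for j in range(m + 1)
            )
    return B[n]


if __name__ == "__main__":
    rng = random.Random(2024)
    n, p = 200, 0.3
    root = None
    for _ in range(n):
        # n bits per key always suffice: the tree depth never exceeds n - 1.
        root = insert(root, random_key(rng, p, n))
    print("one sample profile:", profile(root))
    print("average profile   :", [round(b, 2) for b in expected_profile(n, p, 15)])
```

Running the script prints one sampled profile next to the exact averages for the first 16 levels. For small `n` the recurrence is cheap to evaluate directly; the asymptotic analysis described above addresses the regime where `n` is large and the depth grows with `n`.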