Assembling approximately optimal binary search trees efficiently using arithmetics

Authors:
Jussi Kujala
Affiliations:
Institute of Software Systems, Tampere University of Technology, Finland
Venue:
Information Processing Letters
Year:
2009

Abstract

We introduce a new algorithm for computing an approximately optimal binary search tree when the access probabilities, or weights, of the items are known. The algorithm is simple to implement and makes two contributions. First, a randomized variant of the algorithm produces a binary search tree whose expected performance improves on the previous theoretical guarantees (the performance depends on the value of the input random variable). More precisely, if $p$ is the probability of accessing an item, then in expectation the item is found after searching $\lg(1/p) + 0.087 + \log_2(1 + p_{\max})$ nodes, where $p_{\max}$ is the maximal probability among the items. The previous best bound was $\lg(1/p) + 1$, albeit deterministic. For the optimal tree, our upper bound implies a non-constructive performance bound of $H + 0.087 + \log_2(1 + p_{\max})$, where $H$ is the entropy of the item distribution; the previous bound was $H + 1$. The second contribution is a low cost in I/O models of cost, such as the cache-oblivious model, while the produced tree simultaneously attains the above bound.
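To make the two bounds in the abstract concrete, the following minimal sketch evaluates them numerically for a small access distribution. It does not implement the paper's tree-construction algorithm; it only computes the entropy H and the two expressions H + 1 and H + 0.087 + log2(1 + p_max). The function names and the example distribution are assumptions chosen for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in bits) of an access distribution."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)


def item_bound_previous(p):
    """Previous per-item bound quoted in the abstract: lg(1/p) + 1."""
    return math.log2(1 / p) + 1


def item_bound_new(p, p_max):
    """Expected per-item bound quoted in the abstract:
    lg(1/p) + 0.087 + log2(1 + p_max)."""
    return math.log2(1 / p) + 0.087 + math.log2(1 + p_max)


def tree_bounds(probs):
    """Return (H, H + 1, H + 0.087 + log2(1 + p_max)) for a distribution."""
    h = entropy(probs)
    p_max = max(probs)
    return h, h + 1, h + 0.087 + math.log2(1 + p_max)


# Hypothetical example distribution (not taken from the paper).
probs = [0.5, 0.25, 0.125, 0.125]
h, previous, new = tree_bounds(probs)
print(f"H = {h:.3f}, H + 1 = {previous:.3f}, new bound = {new:.3f}")
```

For this example distribution H is 1.75 bits, so the previous bound is 2.75 and the new expression is roughly 2.42. Because p_max is at most 1, the term log2(1 + p_max) is at most 1, and the gap between the two expressions widens as the largest access probability shrinks.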