We introduce a new algorithm for computing an approximately optimal binary search tree given known access probabilities (or weights) on the items. The algorithm is simple to implement and makes two contributions. First, a randomized variant of the algorithm produces a binary search tree whose expected performance improves on the previous theoretical guarantees (the performance depends on the value of the input random variable). More precisely, if $p$ is the probability of accessing an item, then in expectation the item is found after searching $\lg(1/p) + 0.087 + \lg_2(1 + p_{\max})$ nodes, where $p_{\max}$ is the maximum access probability among the items. The previous best bound was $\lg(1/p) + 1$, albeit deterministic. For the optimal tree, our upper bound implies a non-constructive performance bound of $H + 0.087 + \lg_2(1 + p_{\max})$, where $H$ is the entropy of the item distribution; the previous bound was $H + 1$. Second, the algorithm has low cost in I/O cost models such as the cache-oblivious model, while simultaneously attaining the above bound for the produced tree.
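To make the two bounds concrete, the following minimal sketch evaluates them numerically for a given access distribution. It does not implement the tree-construction algorithm, which the abstract does not describe. The function names and the sample distribution are hypothetical, and the sketch assumes that both lg and lg with subscript 2 denote the base-2 logarithm.

```python
import math


def entropy(probs):
    """Shannon entropy H (in bits) of an access distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def item_bound(p, p_max):
    """Per-item expected search cost quoted in the abstract:
    lg(1/p) + 0.087 + lg(1 + p_max), assuming base-2 logarithms."""
    return math.log2(1 / p) + 0.087 + math.log2(1 + p_max)


def new_bound(probs):
    """Expected-cost bound H + 0.087 + lg(1 + p_max) from the abstract."""
    return entropy(probs) + 0.087 + math.log2(1 + max(probs))


def previous_bound(probs):
    """Earlier bound H + 1 quoted in the abstract."""
    return entropy(probs) + 1


# Hypothetical access distribution, chosen only for illustration.
probs = [0.4, 0.3, 0.2, 0.1]

if __name__ == "__main__":
    print("entropy H      :", entropy(probs))
    print("new bound      :", new_bound(probs))
    print("previous bound :", previous_bound(probs))
```

Under these assumptions, the new bound is smaller than H + 1 whenever 0.087 + lg(1 + p_max) < 1, which holds for p_max below roughly 0.88.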