Multiple choice tries and distributed hash tables

Authors:
Luc Devroye;Gabor Lugosi;Gahyun Park;Wojciech Szpankowski
Affiliations:
McGill University, Montreal, Canada;Universitat Pompeu Fabra, Ramon Trias Fargas, Barcelona, Spain;University of Wisconsin, Whitewater, WI;Purdue University, West Lafayette, IN
Venue:
SODA '07 Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms
Year:
2007

Abstract

Tries were introduced by Fredkin in 1960 as an efficient method for searching and sorting digital data. Recent years have seen a resurgence of interest in tries. In some applications, most notably distributed hash tables, one needs to design a well-balanced trie. In this paper we consider tries built from n strings, where each string can be chosen from a pool of k strings, each generated by a discrete i.i.d. source. Three cases are considered: k = 2, k large but fixed, and k ~ c log n. Various parameters, such as the height and the fill-up level, are analyzed. It is shown that two-choice tries achieve a 50% reduction in height compared to ordinary tries. In a greedy on-line construction, in which the string of each pair that minimizes the depth of insertion is the one inserted, the height is reduced by only 25%. To reduce the height by a further 25%, we design a more refined on-line algorithm whose total computation time is O(n log n). Furthermore, when the best among k ≥ 2 strings is chosen, then for large but fixed k the height is asymptotically equal to the typical depth in a trie, a result that cannot be improved. Further improvement can be achieved if the number of choices is proportional to log n. In this case, for unbiased memoryless sources, highly balanced trees can be constructed by a simple greedy algorithm for which the difference between the height and the fill-up level is bounded by a constant with high probability. This, in turn, has implications for distributed hash tables, leading to a randomized ID management algorithm in peer-to-peer networks such that, with high probability, the ratio between the maximum and the minimum load of a processor is O(1).
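
The greedy on-line rule mentioned in the abstract (for each pool of k candidate strings, insert the one with the smallest depth of insertion) can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the authors' algorithm or their refined variant: the source is taken to be unbiased binary, strings are truncated to a hypothetical length `BITS`, the trie is represented implicitly by a dictionary of prefix counts, and all function names are made up for illustration. Depths are counted so that a string whose first symbol is unshared sits at depth 1.

```python
import random

BITS = 64  # assumed truncation length; long enough that ties are negligible


def random_string(rng):
    """Draw one string from an unbiased memoryless binary source."""
    return format(rng.getrandbits(BITS), "0{}b".format(BITS))


def insertion_depth(prefix_count, s):
    """Depth at which s would be stored: the length of its shortest
    prefix that no already-inserted string shares."""
    for d in range(1, len(s) + 1):
        if prefix_count.get(s[:d], 0) == 0:
            return d
    return len(s)  # s duplicates a stored string (negligible for BITS = 64)


def insert(prefix_count, s):
    """Record every prefix of s; this encodes the trie implicitly."""
    for d in range(1, len(s) + 1):
        p = s[:d]
        prefix_count[p] = prefix_count.get(p, 0) + 1


def height(prefix_count):
    """Height for two or more strings: one more than the longest
    prefix shared by at least two stored strings."""
    shared = (len(p) for p, c in prefix_count.items() if c >= 2)
    return 1 + max(shared, default=0)


def greedy_height(n, k, seed=0):
    """Insert n strings, each picked greedily from a fresh pool of k
    candidates, and return the height of the resulting trie."""
    rng = random.Random(seed)
    prefix_count = {}
    for _ in range(n):
        pool = [random_string(rng) for _ in range(k)]
        best = min(pool, key=lambda s: insertion_depth(prefix_count, s))
        insert(prefix_count, best)
    return height(prefix_count)


if __name__ == "__main__":
    for k in (1, 2, 4):
        print("k =", k, "height =", greedy_height(2000, k, seed=1))
```

Setting `k = 1` gives an ordinary random trie, so comparing its height with that for `k = 2` over several seeds gives a rough empirical feel for the reduction the abstract attributes to greedy two-choice insertion. A single run at small n is noisy and should not be read as confirming the asymptotic constants.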