Hierarchical binary search

Authors:
Arthur Gill
Affiliations:
Univ. of California, Berkeley, CA
Venue:
Communications of the ACM
Year:
1980

Citing 3
Cited 1

The art of computer programming, volume 1 (3rd ed.): fundamental algorithms

The art of computer programming, volume 3: (2nd ed.) sorting and searching

Combinatorial Algorithms: Theory and Practice


A compendium of key search references

ACM SIGIR Forum

Abstract

In hierarchical search the data structure holding the file keys is partitioned into substructures of the same type; these are searched consecutively until the queried key is found or the substructures are exhausted. The interest here is in the conditions under which the performance of a hierarchical organization of static files is superior to that of the nonhierarchical organization, and in the construction of the hierarchy when these conditions are met. The performance criterion is the average number of comparisons in a successful search, where averaging extends over all keys and over all permutations of the keys' access probabilities. General properties of hierarchical search are first derived, and attention is then focused on the hierarchical binary organization: the special case where each of the data substructures is a sorted array (or a balanced binary tree) and where the keys are accessed by binary search. It is shown that an advantageous two-stage hierarchy is always implementable when the keys' access density function φ(i) is “steeper” than Zipf's density function ζ(i); the steeper it is, the greater the advantage. A simple method for constructing the two-stage hierarchy is formulated, based on finding the intersection of φ(i) and ζ(i). For the r-stage hierarchical organization, partitioning procedures are proposed which are based on the iterative application of the two-stage techniques.
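
The comparison the abstract describes can be illustrated with a small Python sketch. It is not the paper's construction: the split point is found here by a brute-force scan rather than by intersecting φ(i) with ζ(i), and the cost model is an assumption made for illustration. Each probe counts as one comparison, a key's position inside its substructure is taken as uniformly random (one reading of "averaging over all permutations"), and a miss in the first substructure is averaged uniformly over its gaps. The function names and the 1/i² test density are hypothetical.

```python
def total_comparisons(n):
    """Sum of probe counts over all n keys of a balanced binary search."""
    total, level, remaining = 0, 1, n
    while remaining > 0:
        nodes = min(2 ** (level - 1), remaining)  # keys found on this level
        total += nodes * level                    # each costs `level` probes
        remaining -= nodes
        level += 1
    return total


def successful_cost(n):
    """Average probes for a hit, all n positions equally likely."""
    return total_comparisons(n) / n


def unsuccessful_cost(n):
    """Average probes for a miss, averaged over the n + 1 gaps (assumption)."""
    # External path length = internal path length + 2n,
    # and internal path length = total_comparisons(n) - n.
    return (total_comparisons(n) + n) / (n + 1)


def two_stage_cost(probs, m):
    """Average probes when the m most probable keys form the first stage."""
    ranked = sorted(probs, reverse=True)
    n = len(ranked)
    head = sum(ranked[:m])   # probability mass found in stage one
    tail = sum(ranked[m:])   # mass that misses stage one, then hits stage two
    return head * successful_cost(m) + tail * (
        unsuccessful_cost(m) + successful_cost(n - m)
    )


def best_split(probs):
    """Brute-force scan for the stage-one size with the lowest average cost."""
    return min(range(1, len(probs)), key=lambda m: two_stage_cost(probs, m))


# Hypothetical access density proportional to 1/i^2, steeper than Zipf's 1/i.
n = 1023
weights = [1 / i ** 2 for i in range(1, n + 1)]
scale = sum(weights)
steep = [w / scale for w in weights]

m = best_split(steep)
print("flat binary search:", successful_cost(n))
print("best stage-one size:", m)
print("two-stage cost:", two_stage_cost(steep, m))
```

Under these assumptions the flat cost does not depend on the access density at all, since every key is equally likely to land at any depth, so any gain from the hierarchy comes entirely from concentrating probability mass in a small first stage.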