Time- and space-optimality in B-trees

  • Authors: Arnold L. Rosenberg; Lawrence Snyder

  • Affiliations: IBM Thomas J. Watson Research Center, Yorktown Heights, NY; Yale Univ., New Haven, CT

  • Venue: ACM Transactions on Database Systems (TODS)

  • Year: 1981

Abstract

A B-tree is compact if it is minimal in number of nodes, hence has optimal space utilization, among equally capacious B-trees of the same order. The space utilization of compact B-trees is analyzed and compared with that of noncompact B-trees and with (node)-visit-optimal B-trees, which minimize the expected number of nodes visited per key access. Compact B-trees can be as much as a factor of 2.5 more space efficient than visit-optimal B-trees; and the node-visit cost of a compact tree is never more than 1 + the node-visit cost of an optimal tree. The utility of initializing a B-tree to be compact (which initialization can be done in time linear in the number of keys if the keys are presorted) is demonstrated by comparing the space utilization of a compact tree that has been augmented by random insertions with that of a tree that has been grown entirely by random insertions. Even after increasing the number of keys by a modest amount, the effects of compact initialization are still felt. Once the tree has grown so large that these effects are no longer discernible, the tree can be expeditiously compacted in place using an algorithm presented here; and the benefits of compactness resume.
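To give a feel for how much the node count of a B-tree can vary at a fixed number of keys, here is a minimal sketch of two elementary counting bounds. It is not the paper's analysis or its construction of compact trees. It assumes the common convention that a B-tree of order m allows at most m children (hence at most m - 1 keys) per node, that every non-root node holds at least ceil(m/2) - 1 keys, and that keys are stored in all nodes. The function name `node_count_bounds` is hypothetical. The lower bound is only a necessary condition: the structural constraints of a B-tree may prevent it from being attained, and the true minimum is what a compact tree achieves.

```python
def node_count_bounds(num_keys, order):
    """Return (lower, upper) counting bounds on the number of nodes in a
    B-tree of the given order that stores num_keys keys.

    Assumed convention (not taken from the abstract): a node has at most
    `order` children, i.e. at most order - 1 keys, and every non-root node
    has at least ceil(order / 2) - 1 keys.
    """
    if order < 3 or num_keys < 1:
        raise ValueError("need order >= 3 and at least one key")

    max_keys = order - 1               # keys in a completely full node
    min_keys = (order + 1) // 2 - 1    # ceil(order / 2) - 1 for non-root nodes

    # Every node holds at most max_keys keys, so at least
    # ceil(num_keys / max_keys) nodes are needed.
    lower = -(-num_keys // max_keys)

    # The root holds at least 1 key and every other node at least min_keys,
    # so num_keys >= 1 + (nodes - 1) * min_keys.
    upper = 1 + (num_keys - 1) // min_keys

    return lower, upper


if __name__ == "__main__":
    for order in (3, 5, 101):
        low, high = node_count_bounds(10_000, order)
        print(f"order {order}: between {low} and {high} nodes")
```

Under these assumptions, the spread between the two bounds indicates why an initialization that packs nodes as fully as the structure allows can save a substantial fraction of the space used by a sparsely filled tree.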