Polychotomic Encoding: A Better Quasi-Optimal Bit-Vector Encoding of Tree Hierarchies

  • Authors:
  • Robert E. Filman

  • Affiliations:
  • -

  • Venue:
  • ECOOP '02 Proceedings of the 16th European Conference on Object-Oriented Programming
  • Year:
  • 2002

Quantified Score

Hi-index 0.01

Abstract

Polychotomic Encoding is an algorithm for producing bit-vector encodings of trees. It extends the Dichotomic Encoding algorithm of Raynaud and Thierry. Polychotomic and Dichotomic Encodings are both hierarchical encoding algorithms, in which each node in the tree is given a gene, that is, a subset of {1, ..., n}. The encoding of each node is then the union of that node's gene with the genes of its ancestors. Reachability in the tree can then be determined by subset testing on the encodings.

Dichotomic Encoding restructures the given tree into a binary tree and then assigns two-bit, incompatible (chotomic) "genes" to each of the two children of a node. Polychotomic Encoding substitutes a multibit encoding for the children of a node when the restructuring operation of Dichotomic Encoding would produce a new heaviest child (the child requiring the most bits to represent a tree of its children) for that node. The paper includes a proof that Polychotomic Encoding never produces an encoding using more bits than Dichotomic Encoding. Experimentally, Polychotomic Encoding produces space savings of up to 15% on examples of naturally occurring hierarchies, and 25% on trees in the randomly generated test set.
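
The general hierarchical scheme described in the abstract (a gene per node, an encoding formed as the union of the genes along the path to the root, and reachability decided by a subset test) can be illustrated with a minimal Python sketch. This is not the Polychotomic or Dichotomic algorithm itself: the tree, the node names, and the hand-picked genes below are all hypothetical, chosen only so that sibling genes are incompatible (neither is a subset of the other). Encodings are stored as integer bitmasks, which is an assumption about representation rather than something the abstract specifies.

```python
from typing import Dict, Optional, Set

# Hypothetical example tree, given as child -> parent (the root has no parent).
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "A": "root", "B": "root",
    "A1": "A", "A2": "A",
    "B1": "B",
}

# Hand-picked genes (an assumption for illustration, not the output of any
# encoding algorithm). Each gene is a subset of {1, ..., n} with n = 4 here.
# Siblings receive incompatible genes; bit 3 is reused in separate subtrees.
GENE: Dict[str, Set[int]] = {
    "root": set(),
    "A": {1}, "B": {2},
    "A1": {3}, "A2": {4},
    "B1": {3},
}


def to_mask(bits: Set[int]) -> int:
    """Pack a subset of {1, ..., n} into an integer bitmask (bit i -> 1 << (i-1))."""
    mask = 0
    for b in bits:
        mask |= 1 << (b - 1)
    return mask


def encoding(node: str) -> int:
    """Union of the node's gene with the genes of all its ancestors."""
    mask = 0
    current: Optional[str] = node
    while current is not None:
        mask |= to_mask(GENE[current])
        current = PARENT[current]
    return mask


def is_ancestor_or_self(anc: str, desc: str) -> bool:
    """Subset test: anc reaches desc iff encoding(anc) is a subset of encoding(desc)."""
    a, d = encoding(anc), encoding(desc)
    return a & d == a


def tree_ancestor_or_self(anc: str, desc: str) -> bool:
    """Reference answer computed by walking parent links, used to check the encoding."""
    current: Optional[str] = desc
    while current is not None:
        if current == anc:
            return True
        current = PARENT[current]
    return False


if __name__ == "__main__":
    for x in PARENT:
        for y in PARENT:
            assert is_ancestor_or_self(x, y) == tree_ancestor_or_self(x, y)
    print("subset test agrees with the tree on all node pairs")
```

In this toy case four bits suffice for six nodes. The algorithms the abstract compares differ in how they choose the genes so that the total number of bits stays small; the sketch only shows why the subset test works once incompatible genes have been assigned.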