Partial alphabetic trees

  • Authors:
  • Arye Barkan;Haim Kaplan

  • Affiliations:
  • School of Computer Science, Faculty of Exact Sciences, Tel-Aviv University, Tel Aviv, Israel;School of Computer Science, Faculty of Exact Sciences, Tel-Aviv University, Tel Aviv, Israel

  • Venue:
  • Journal of Algorithms
  • Year:
  • 2006

Abstract

In the partial alphabetic tree problem we are given a multiset of non-negative weights $W = \{w_1,\ldots,w_n\}$, partitioned into $m \le n$ blocks $B_1,\ldots,B_m$. We want to find a binary tree $T$ whose leaves hold the elements of $W$ such that, when the leaves are traversed from left to right, all leaves of $B_i$ precede all leaves of $B_j$ for every $i < j$. Furthermore, among all such trees, $T$ has to minimize $\sum_{i=1}^{n} w_i\, d(w_i)$, where $d(w_i)$ is the depth of $w_i$ in $T$. The partial alphabetic tree problem generalizes the problem of finding a Huffman tree over $W$ (there is only one block) and the problem of finding a minimum cost alphabetic tree over $W$ (each block consists of a single item). This problem arises when we need an optimal binary code for a set of items with known frequencies, subject to a lexicographic restriction on some of the codewords.

Our main result is a pseudo-polynomial time algorithm that finds the optimal tree. Our algorithm runs in $O\big((W_{sum}/W_{min})^{2\alpha} \log(W_{sum}/W_{min})\, n^2\big)$ time, where $W_{sum} = \sum_{i=1}^{n} w_i$, $W_{min} = \min_i w_i$, and $\alpha = 1/\log\varphi \approx 1.44$ ($\varphi = (\sqrt{5}+1)/2 \approx 1.618$ is the golden ratio). In particular, the running time is polynomial when the weights are bounded by a polynomial in $n$. To bound the running time of our algorithm we prove an upper bound of $\lfloor \alpha \log(W_{sum}/W_{min}) + 0.56 \rfloor$ on the depth of the optimal tree.

Our algorithm relies on a solution to what we call the layered Huffman forest problem, which is of independent interest. In the layered Huffman forest problem we are given an unordered multiset of weights $W = \{w_1,\ldots,w_n\}$ and a multiset of integers $D = \{d_1,\ldots,d_k\}$. We look for a forest $F$ with $k$ trees, $T_1,\ldots,T_k$, where the weights in $W$ correspond to the leaves of $F$, that minimizes $\sum_{i=1}^{n} w_i\, d_F(w_i)$, where $d_F(w_i)$ is the depth of $w_i$ in its tree plus $d_j$ if $w_i \in T_j$. Our algorithm for this problem runs in $O(kn^2)$ time.
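
To make the problem statement concrete, the following Python sketch evaluates the objective on small inputs. It is not the algorithm of the paper: the two special cases use the textbook greedy Huffman merge and a standard cubic interval dynamic program, and the general case is solved by exponential brute force over the orderings inside each block, which is only usable for tiny instances. All function names are hypothetical. The depth bound is computed with base-2 logarithms, an assumption consistent with the stated value of α ≈ 1.44, and it requires a strictly positive minimum weight.

```python
import heapq
import itertools
import math
from functools import lru_cache


def huffman_cost(weights):
    """Minimum sum of w_i * d(w_i) with no ordering constraint (a single block)."""
    heap = list(weights)
    heapq.heapify(heap)
    cost = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        cost += a + b  # every leaf under the merged node gets one level deeper
        heapq.heappush(heap, a + b)
    return cost


def alphabetic_cost(weights):
    """Minimum cost when the leaves must appear in the given left-to-right
    order (every block is a single item). Standard O(n^3) interval DP."""
    w = list(weights)
    prefix = [0]
    for x in w:
        prefix.append(prefix[-1] + x)

    @lru_cache(maxsize=None)
    def best(i, j):
        # Cheapest tree over the consecutive leaves i..j (inclusive).
        if i == j:
            return 0
        total = prefix[j + 1] - prefix[i]
        return total + min(best(i, k) + best(k + 1, j) for k in range(i, j))

    return best(0, len(w) - 1)


def partial_alphabetic_cost_bruteforce(blocks):
    """Exponential-time reference: try every ordering inside each block,
    keep the blocks in order, and take the best alphabetic tree."""
    best = math.inf
    for choice in itertools.product(*(itertools.permutations(b) for b in blocks)):
        order = [w for block in choice for w in block]
        best = min(best, alphabetic_cost(order))
    return best


def depth_bound(weights):
    """floor(alpha * log2(W_sum / W_min) + 0.56); base-2 log is an assumption."""
    phi = (math.sqrt(5) + 1) / 2
    alpha = 1 / math.log2(phi)
    return math.floor(alpha * math.log2(sum(weights) / min(weights)) + 0.56)


if __name__ == "__main__":
    print(huffman_cost([1, 3, 1]))                            # one block
    print(alphabetic_cost([1, 3, 1]))                         # singleton blocks
    print(partial_alphabetic_cost_bruteforce([[1, 3], [1]]))  # mixed case
    print(depth_bound([1, 1, 2, 3, 5]))
```

On the weights 1, 3, 1 the unconstrained optimum costs 7 and the fully ordered optimum costs 9. With blocks {1, 3} and {1}, reordering the first block to 3, 1 recovers cost 7, which shows how the partial constraint sits between the two extremes.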