Approximating Optimal Binary Decision Trees

  • Authors:
  • Micah Adler; Brent Heeringa

  • Affiliations:
  • Department of Computer Science, University of Massachusetts, Amherst, MA 01003; Department of Computer Science, Williams College, Williamstown, MA 01267

  • Venue:
  • APPROX '08 / RANDOM '08 Proceedings of the 11th international workshop, APPROX 2008, and 12th international workshop, RANDOM 2008 on Approximation, Randomization and Combinatorial Optimization: Algorithms and Techniques
  • Year:
  • 2008

Abstract

We give a (ln n + 1)-approximation for the decision tree (DT) problem. An instance of DT is a set of m binary tests T = (T_1, ..., T_m) and a set of n items X = (X_1, ..., X_n). The goal is to output a binary tree where each internal node is a test, each leaf is an item, and the total external path length of the tree is minimized. Total external path length is the sum of the depths of all the leaves in the tree. DT has a long history in computer science, with applications ranging from medical diagnosis to experiment design. It also generalizes the problem of finding optimal average-case search strategies in partially ordered sets, which includes several alphabetic tree problems. Our work decreases the previous upper bound on the approximation ratio by a constant factor. We provide a new analysis of the greedy algorithm that uses a simple accounting scheme to spread the cost of a tree among pairs of items split at a particular node. We conclude by showing that our upper bound also holds for the DT problem with weighted tests.
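
The abstract does not spell out the greedy rule or a data representation, so the following is only a minimal sketch under stated assumptions: each test is modeled as the set of items on which it answers "yes", and the greedy step picks the test that splits the remaining items most evenly. The names `greedy_tree` and `external_path_length` are hypothetical and chosen for illustration; this is not the authors' code.

```python
from typing import FrozenSet, List, Tuple, Union

# Assumption for this sketch: a test is the set of items on which it answers "yes".
Test = FrozenSet[str]
# A tree is either a leaf (an item name) or (test index, "no" subtree, "yes" subtree).
Tree = Union[str, Tuple[int, "Tree", "Tree"]]


def greedy_tree(items: FrozenSet[str], tests: List[Test]) -> Tree:
    """Build a decision tree by always choosing the most balanced split."""
    if len(items) == 1:
        return next(iter(items))
    best_index, best_gap = None, None
    for i, test in enumerate(tests):
        yes = items & test
        no = items - test
        if not yes or not no:
            continue  # this test does not separate the remaining items
        gap = abs(len(yes) - len(no))
        if best_gap is None or gap < best_gap:  # ties go to the earliest test
            best_index, best_gap = i, gap
    if best_index is None:
        raise ValueError("no test distinguishes the remaining items")
    yes = items & tests[best_index]
    no = items - tests[best_index]
    return (best_index, greedy_tree(no, tests), greedy_tree(yes, tests))


def external_path_length(tree: Tree, depth: int = 0) -> int:
    """Sum of the depths of all leaves, the objective described in the abstract."""
    if isinstance(tree, str):
        return depth
    _, no_subtree, yes_subtree = tree
    return (external_path_length(no_subtree, depth + 1)
            + external_path_length(yes_subtree, depth + 1))


if __name__ == "__main__":
    items = frozenset("abcd")
    tests = [frozenset("a"), frozenset("ab"), frozenset("ac")]
    tree = greedy_tree(items, tests)
    print(tree, external_path_length(tree))
```

On this small instance the greedy rule first applies the test {a, b}, which splits the four items two and two, and then separates each pair with one more test, so every leaf sits at depth 2 and the total external path length is 8.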