The tree inclusion problem: In linear space and faster

  • Authors:
  • Philip Bille;Inge Li Gortz

  • Affiliations:
  • Technical University of Denmark;Technical University of Denmark

  • Venue:
  • ACM Transactions on Algorithms (TALG)
  • Year:
  • 2011

Abstract

Given two rooted, ordered, and labeled trees P and T the tree inclusion problem is to determine if P can be obtained from T by deleting nodes in T. This problem has recently been recognized as an important query primitive in XML databases. Kilpeläinen and Mannila [1995] presented the first polynomial-time algorithm using quadratic time and space. Since then several improved results have been obtained for special cases when P and T have a small number of leaves or small depth. However, in the worst case these algorithms still use quadratic time and space. Let $n_S$, $l_S$, and $d_S$ denote the number of nodes, the number of leaves, and the depth of a tree $S \in \{P, T\}$. In this article we show that the tree inclusion problem can be solved in space $O(n_T)$ and time:

$$O\left(\min\left\{\, l_P n_T,\;\; l_P l_T \log\log n_T + n_T,\;\; \frac{n_P n_T}{\log n_T} + n_T \log n_T \,\right\}\right).$$

This improves or matches the best known time complexities while using only linear space instead of quadratic. This is particularly important in practical applications, such as XML databases, where the space is likely to be a bottleneck.
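To make the problem statement concrete, the following is a minimal brute-force sketch of the inclusion test. It is not the algorithm of the article and makes no attempt at the stated time or space bounds. It only encodes the definition: deleting a node of T puts its children in its place, in order. The tree encoding and the names `node`, `forest_included`, and `tree_included` are assumptions made for illustration, and the sketch assumes that the root of T may also be deleted, so that P may be included in a proper subtree of T.

```python
from functools import lru_cache

# Assumed encoding: a tree is (label, children), where children is a tuple
# of trees; a forest is a tuple of trees in left-to-right order.

def node(label, *children):
    """Hypothetical helper that builds a tree in the assumed encoding."""
    return (label, tuple(children))

@lru_cache(maxsize=None)
def forest_included(p_forest, t_forest):
    """Return True if p_forest can be obtained from t_forest by deleting nodes."""
    if not p_forest:
        return True   # delete every remaining node of t_forest
    if not t_forest:
        return False  # nothing left to map the pattern onto
    # Split off the rightmost tree of each forest.
    *t_rest, (t_label, t_children) = t_forest
    *p_rest, (p_label, p_children) = p_forest
    t_rest, p_rest = tuple(t_rest), tuple(p_rest)
    # Option 1: delete the rightmost root of T; its children take its place.
    if forest_included(p_forest, t_rest + t_children):
        return True
    # Option 2: map the rightmost root of P to the rightmost root of T.
    return (p_label == t_label
            and forest_included(p_children, t_children)
            and forest_included(p_rest, t_rest))

def tree_included(p, t):
    """Return True if tree p is included in tree t (ordered inclusion)."""
    return forest_included((p,), (t,))

# Example: T = a(b(c, d), e)
T = node("a", node("b", node("c"), node("d")), node("e"))
print(tree_included(node("a", node("c"), node("e")), T))  # delete b and d
print(tree_included(node("a", node("e"), node("c")), T))  # sibling order differs
```

The recursion always works on the rightmost roots of the two forests, which is what preserves the left-to-right order. Memoization keeps small examples fast, but the number of distinct subforests can grow quickly, so this sketch is meant only for checking the definition on tiny inputs.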