Query automata over finite trees

  • Authors:
  • Frank Neven; Thomas Schwentick

  • Affiliations:
  • Univ. Campus, Diepenbeek, Belgium; Philipps-Univ. Marburg, Marburg, Germany

  • Venue:
  • Theoretical Computer Science
  • Year:
  • 2002

Abstract

A central task in document transformation and information retrieval is locating subtrees that satisfy some pattern. Unary queries, i.e., queries that map a tree to a set of its nodes, therefore play an important role in structured document databases. The motivation of this work is to understand how the natural and well-studied computation model of tree automata can be used to compute such queries. We define a query automaton (QA) as a deterministic two-way finite automaton over trees that can select nodes depending on the state and the label at those nodes. We study QAs over ranked as well as unranked trees. Unranked trees differ from ranked ones in that there is no bound on the number of children of a node. We characterize the expressiveness of the different formalisms as exactly the unary queries definable in monadic second-order logic (MSO). In contrast to the ranked case, special stay transitions had to be added to QAs over unranked trees to capture MSO. We show that the non-emptiness, containment, and equivalence problems for QAs are EXPTIME-complete.
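
To make the notion of a unary query concrete, here is a minimal Python sketch. It is not the paper's two-way query automaton: it is a simplified, single-pass bottom-up automaton over unranked trees that selects a node based on the state and label at that node. All names (Node, run_query, init, step, finish, selects) are hypothetical and chosen for illustration only. The example query selects every node labeled "a" that has a descendant labeled "b".

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


# A node is addressed by the sequence of child indices leading to it from the root.
Path = Tuple[int, ...]


def run_query(root, init, step, finish, selects) -> Set[Path]:
    """Run a simplified bottom-up automaton and return the selected node addresses.

    Assumption: this is a one-pass illustration, not the two-way model of the paper.
    Children states are folded left to right with `step`, so a node may have any
    number of children (the unranked case).
    """
    selected: Set[Path] = set()

    def visit(node: Node, path: Path):
        acc = init
        for i, child in enumerate(node.children):
            acc = step(acc, visit(child, path + (i,)))
        state = finish(node.label, acc)
        # Selection depends only on the state and the label at this node.
        if selects(state, node.label):
            selected.add(path)
        return state

    visit(root, ())
    return selected


# Example automaton: the state records whether the subtree contains a "b".
def step(acc, child_state):
    return acc or child_state == "has_b"


def finish(label, acc):
    return "has_b" if (label == "b" or acc) else "no_b"


def selects(state, label):
    return label == "a" and state == "has_b"


# The tree a(b, a(c), c(a(b)))
tree = Node("a", [
    Node("b"),
    Node("a", [Node("c")]),
    Node("c", [Node("a", [Node("b")])]),
])

result = run_query(tree, False, step, finish, selects)
print(sorted(result))
```

On this tree the query selects the root and the "a" node below "c", but not the "a" node whose only child is "c". A query that also depends on the context above a node (ancestors or siblings) cannot be answered by this single upward pass in general, which is one reason the paper works with two-way automata.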